Overview
Dataset statistics
| Statistic | Value |
|---|---|
| Number of variables | 49 |
| Number of observations | 2751 |
| Missing cells | 28860 |
| Missing cells (%) | 21.4% |
| Duplicate rows | 5 |
| Duplicate rows (%) | 0.2% |
| Total size in memory | 3.6 MiB |
| Average record size in memory | 1.3 KiB |
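The percentages in this table follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell share is taken over all cells (observations × variables) and the duplicate share over all rows; the variable names are illustrative:

```python
# Counts copied from the dataset statistics table.
n_variables = 49
n_observations = 2751
missing_cells = 28860
duplicate_rows = 5
total_size_mib = 3.6

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_kib = total_size_mib * 1024 / n_observations

print(f"total cells:     {total_cells}")
print(f"missing cells:   {missing_pct:.1f}%")
print(f"duplicate rows:  {duplicate_pct:.1f}%")
print(f"avg record size: {avg_record_kib:.1f} KiB")
```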
Variable types
| Type | Count |
|---|---|
| Categorical | 32 |
| DateTime | 2 |
| Numeric | 14 |
| Boolean | 1 |
Dataset
| Field | Value |
|---|---|
| Description | JHB_Aurum_009 - Quality-corrected harmonized data |
| Creator | RP2 Clinical Data Quality Team |
| Author | Quality-Checked Data |
| URL | HEAT Research Projects |
Variable descriptions
| Variable | Description |
|---|---|
| Age (at enrolment) | Patient age at study enrollment |
| Sex | Biological sex |
| Race | Racial/ethnic group |
| CD4 cell count (cells/µL) | CD4+ T lymphocyte count (missing codes removed) |
| HIV viral load (copies/mL) | HIV RNA copies per mL (missing codes removed) |
| Antiretroviral Therapy Status | Current ART status |
| BMI (kg/m²) | Body Mass Index (extreme values removed) |
| Waist circumference (cm) | Waist circumference (corrected from mm to cm) |
| weight_kg | Body weight in kilograms |
| height_m | Height in meters |
| Hematocrit (%) | Hematocrit (zero values removed) |
| hemoglobin_g_dL | Hemoglobin concentration |
| White blood cell count (×10³/µL) | Total WBC count |
| Red blood cell count (×10⁶/µL) | Total RBC count |
| Platelet count (×10³/µL) | Platelet count (missing codes removed) |
| MCV (MEAN CELL VOLUME) | Mean corpuscular volume |
| mch_pg | Mean corpuscular hemoglobin |
| mchc_g_dL | Mean corpuscular hemoglobin concentration |
| RDW | Red cell distribution width |
| Lymphocyte count (×10⁹/L) | Lymphocyte absolute count (corrected labeling) |
| Neutrophil count (×10⁹/L) | Neutrophil absolute count (corrected labeling) |
| Monocyte count (×10⁹/L) | Monocyte absolute count (corrected labeling) |
| Eosinophil count (×10⁹/L) | Eosinophil absolute count (corrected labeling) |
| Basophil count (×10⁹/L) | Basophil absolute count (corrected labeling) |
| ALT (U/L) | Alanine aminotransferase (missing codes removed) |
| AST (U/L) | Aspartate aminotransferase |
| Alkaline phosphatase (U/L) | Alkaline phosphatase |
| Total bilirubin (mg/dL) | Total bilirubin |
| Albumin (g/dL) | Serum albumin |
| Total protein (g/dL) | Total serum protein |
| creatinine_umol_L | Serum creatinine |
| creatinine clearance | Estimated creatinine clearance |
| Sodium (mEq/L) | Serum sodium |
| Potassium (mEq/L) | Serum potassium |
| fasting_glucose_mmol_L | Fasting blood glucose |
| total_cholesterol_mg_dL | Total cholesterol |
| hdl_cholesterol_mg_dL | HDL cholesterol |
| ldl_cholesterol_mg_dL | LDL cholesterol |
| Triglycerides (mg/dL) | Triglycerides |
| systolic_bp_mmHg | Systolic blood pressure |
| diastolic_bp_mmHg | Diastolic blood pressure |
| heart_rate_bpm | Heart rate (zero values removed) |
| Respiratory rate (breaths/min) | Respiratory rate |
| Oxygen saturation (%) | Oxygen saturation |
| body_temperature_celsius | Body temperature |
| climate_daily_mean_temp | Daily mean temperature |
| climate_daily_max_temp | Daily maximum temperature |
| climate_temp_anomaly | Temperature anomaly from baseline |
| climate_heat_day_p90 | Heat day indicator (>90th percentile) |
| climate_heat_stress_index | Heat stress index |
| cd4_correction_applied | Quality flag: CD4 missing codes removed |
| final_comprehensive_fix_applied | Quality flag: Comprehensive corrections applied |
| waist_circ_unit_correction_applied | Quality flag: Waist circ unit corrected |
| sa_biomarker_standards | South African biomarker reference standards |
study_source has constant value "JHB_Aurum_009" | Constant |
latitude has constant value "-25.7479" | Constant |
longitude has constant value "28.2293" | Constant |
jhb_subregion has constant value "Eastern_JHB" | Constant |
city has constant value "Johannesburg" | Constant |
province has constant value "Gauteng" | Constant |
country has constant value "South Africa" | Constant |
Country has constant value "South Africa" | Constant |
Clinical Study ID has constant value "Tholimpilo_HIV_Linkage_Study" | Constant |
Location of study follow-up has constant value "Aurum Institute - Multi-site Gauteng and Limpopo" | Constant |
coordinate_source has constant value "JHB_Aurum_009" | Constant |
coordinate_precision has constant value "high" | Constant |
geographic_source has constant value "harmonized_datasets" | Constant |
HIV_status has constant value "Positive" | Constant |
johannesburg_metro_valid has constant value "1.0" | Constant |
study_site_location has constant value "Tembisa/East Rand (Aurum Institute)" | Constant |
climate_p90_threshold has constant value "28.409" | Constant |
climate_p95_threshold has constant value "29.704" | Constant |
climate_p99_threshold has constant value "31.797" | Constant |
sa_biomarker_standards has constant value "1.0" | Constant |
final_comprehensive_fix_applied has constant value "1.0" | Constant |
total_protein_extreme_flag has constant value "0.0" | Constant |
dphru_053_final_corrections_applied has constant value "0.0" | Constant |
ezin_002_final_corrections_applied has constant value "0.0" | Constant |
quality_harmonization_version has constant value "2.0" | Constant |
waist_circ_unit_correction_applied has constant value "False" | Constant |
Dataset has 5 (0.2%) duplicate rows | Duplicates
CD4 cell count (cells/µL) is highly overall correlated with cd4_correction_applied | High correlation |
cd4_correction_applied is highly overall correlated with CD4 cell count (cells/µL) | High correlation |
climate_14d_mean_temp is highly overall correlated with climate_30d_mean_temp and 11 other fields | High correlation |
climate_30d_mean_temp is highly overall correlated with climate_14d_mean_temp and 11 other fields | High correlation |
climate_7d_max_temp is highly overall correlated with climate_14d_mean_temp and 7 other fields | High correlation |
climate_7d_mean_temp is highly overall correlated with climate_14d_mean_temp and 11 other fields | High correlation |
climate_daily_max_temp is highly overall correlated with climate_14d_mean_temp and 11 other fields | High correlation |
climate_daily_mean_temp is highly overall correlated with climate_14d_mean_temp and 12 other fields | High correlation |
climate_daily_min_temp is highly overall correlated with climate_14d_mean_temp and 11 other fields | High correlation |
climate_heat_day_p90 is highly overall correlated with climate_14d_mean_temp and 12 other fields | High correlation |
climate_heat_day_p95 is highly overall correlated with climate_14d_mean_temp and 12 other fields | High correlation |
climate_heat_stress_index is highly overall correlated with climate_14d_mean_temp and 11 other fields | High correlation |
climate_season is highly overall correlated with climate_14d_mean_temp and 14 other fields | High correlation |
climate_standardized_anomaly is highly overall correlated with climate_daily_mean_temp and 4 other fields | High correlation |
climate_temp_anomaly is highly overall correlated with climate_heat_day_p90 and 6 other fields | High correlation |
month is highly overall correlated with climate_heat_day_p90 and 4 other fields | High correlation |
season is highly overall correlated with climate_14d_mean_temp and 13 other fields | High correlation |
year is highly overall correlated with climate_14d_mean_temp and 10 other fields | High correlation |
climate_heat_day_p90 is highly imbalanced (69.4%) | Imbalance |
climate_heat_day_p95 is highly imbalanced (69.4%) | Imbalance |
cd4_correction_applied is highly imbalanced (85.9%) | Imbalance |
CD4 cell count (cells/µL) has 533 (19.4%) missing values | Missing |
HIV viral load (copies/mL) has 2461 (89.5%) missing values | Missing |
climate_daily_mean_temp has 1616 (58.7%) missing values | Missing |
climate_daily_max_temp has 1616 (58.7%) missing values | Missing |
climate_daily_min_temp has 1616 (58.7%) missing values | Missing |
climate_7d_mean_temp has 1616 (58.7%) missing values | Missing |
climate_7d_max_temp has 1616 (58.7%) missing values | Missing |
climate_14d_mean_temp has 1616 (58.7%) missing values | Missing |
climate_30d_mean_temp has 1616 (58.7%) missing values | Missing |
climate_temp_anomaly has 1616 (58.7%) missing values | Missing |
climate_standardized_anomaly has 1616 (58.7%) missing values | Missing |
climate_heat_day_p90 has 1616 (58.7%) missing values | Missing |
climate_heat_day_p95 has 1616 (58.7%) missing values | Missing |
climate_heat_stress_index has 1616 (58.7%) missing values | Missing |
climate_p90_threshold has 1616 (58.7%) missing values | Missing |
climate_p95_threshold has 1616 (58.7%) missing values | Missing |
climate_p99_threshold has 1616 (58.7%) missing values | Missing |
climate_season has 1616 (58.7%) missing values | Missing |
HIV viral load (copies/mL) has 246 (8.9%) zeros | Zeros |
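The Missing alerts can be reconciled with the 28860 missing cells reported in the overview. The sketch below tallies the 16 climate variables flagged in the alerts, the CD4 and viral load counts, and the two small counts from the Age and Sex summaries further down. The tally matches the overview total, which suggests the remaining variables are complete, although the alerts do not state this.

```python
# Missing-value counts copied from this report.
n_observations = 2751
climate_columns_flagged = 16   # climate_* variables listed in the Missing alerts
climate_missing_each = 1616

missing_by_variable = {
    "CD4 cell count (cells/µL)": 533,
    "HIV viral load (copies/mL)": 2461,
    "Age (at enrolment)": 6,   # from the variable summary, not an alert
    "Sex": 4,                  # from the variable summary, not an alert
}

total_missing = (climate_columns_flagged * climate_missing_each
                 + sum(missing_by_variable.values()))
print(total_missing)  # 28860

# Per-variable share of rows that are missing.
for name, count in missing_by_variable.items():
    print(f"{name}: {100 * count / n_observations:.1f}%")
print(f"each climate variable: {100 * climate_missing_each / n_observations:.1f}%")
```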
Reproduction
| Field | Value |
|---|---|
| Analysis started | 2025-11-24 22:05:33.899395 |
| Analysis finished | 2025-11-24 22:05:42.186232 |
| Duration | 8.29 seconds |
| Software version | ydata-profiling v4.18.0 |
| Download configuration | config.json |
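The reported duration is the difference between the two timestamps. A short check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the reproduction table.
started = datetime.fromisoformat("2025-11-24 22:05:33.899395")
finished = datetime.fromisoformat("2025-11-24 22:05:42.186232")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```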
Variables
study_source
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 188.1 KiB |
| JHB_Aurum_009 |
|---|
Length
| Max length | 13 |
|---|---|
| Median length | 13 |
| Mean length | 13 |
| Min length | 13 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | JHB_Aurum_009 |
|---|---|
| 2nd row | JHB_Aurum_009 |
| 3rd row | JHB_Aurum_009 |
| 4th row | JHB_Aurum_009 |
| 5th row | JHB_Aurum_009 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| JHB_Aurum_009 | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| _ | 5502 | |
| u | 5502 | |
| 0 | 5502 | |
| J | 2751 | |
| H | 2751 | |
| B | 2751 | |
| A | 2751 | |
| r | 2751 | |
| m | 2751 | |
| 9 | 2751 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11004 | |
| Uppercase Letter | 11004 | |
| Decimal Number | 8253 | |
| Connector Punctuation | 5502 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| J | 2751 | |
| H | 2751 | |
| B | 2751 | |
| A | 2751 |
Lowercase Letter
| Value | Count | Frequency (%) |
| u | 5502 | |
| r | 2751 | |
| m | 2751 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 5502 | |
| 9 | 2751 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 5502 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 22008 | |
| Common | 13755 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| u | 5502 | |
| J | 2751 | |
| H | 2751 | |
| B | 2751 | |
| A | 2751 | |
| r | 2751 | |
| m | 2751 |
Common
| Value | Count | Frequency (%) |
| _ | 5502 | |
| 0 | 5502 | |
| 9 | 2751 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 35763 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| _ | 5502 | |
| u | 5502 | |
| 0 | 5502 | |
| J | 2751 | |
| H | 2751 | |
| B | 2751 | |
| A | 2751 | |
| r | 2751 | |
| m | 2751 | |
| 9 | 2751 |
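Because study_source is constant, every character table above is the per-string count multiplied by the number of rows. A minimal sketch, assuming the report counts characters across all 2751 rows and groups them by Unicode general category (`Lu` = Uppercase Letter, `Ll` = Lowercase Letter, `Nd` = Decimal Number, `Pc` = Connector Punctuation):

```python
import unicodedata
from collections import Counter

value = "JHB_Aurum_009"
n_rows = 2751

# Counts within a single string.
char_counts = Counter(value)
category_counts = Counter(unicodedata.category(ch) for ch in value)

# Scale to the whole column, since every row holds the same string.
chars_total = {ch: count * n_rows for ch, count in char_counts.items()}
categories_total = {cat: count * n_rows for cat, count in category_counts.items()}

print(chars_total)
print(categories_total)
print(sum(chars_total.values()))  # total characters in the column
```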
primary_date
Date
| Distinct | 447 |
|---|---|
| Distinct (%) | 16.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 43.0 KiB |
| Minimum | 2013-03-14 00:00:00 |
|---|---|
| Maximum | 2015-08-01 00:00:00 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
year
Categorical
High correlation
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 169.3 KiB |
| 2014.0 | |
|---|---|
| 2013.0 | |
| 2015.0 | 1 |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 2014.0 |
|---|---|
| 2nd row | 2014.0 |
| 3rd row | 2014.0 |
| 4th row | 2014.0 |
| 5th row | 2013.0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 2014.0 | 1677 | 61.0% |
| 2013.0 | 1073 | 39.0% |
| 2015.0 | 1 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 5502 | |
| 2 | 2751 | |
| 1 | 2751 | |
| . | 2751 | |
| 4 | 1677 | 10.2% |
| 3 | 1073 | 6.5% |
| 5 | 1 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 13755 | |
| Other Punctuation | 2751 | 16.7% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 5502 | |
| 2 | 2751 | |
| 1 | 2751 | |
| 4 | 1677 | 12.2% |
| 3 | 1073 | 7.8% |
| 5 | 1 | < 0.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 2751 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 16506 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 5502 | |
| 2 | 2751 | |
| 1 | 2751 | |
| . | 2751 | |
| 4 | 1677 | 10.2% |
| 3 | 1073 | 6.5% |
| 5 | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 16506 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 5502 | |
| 2 | 2751 | |
| 1 | 2751 | |
| . | 2751 | |
| 4 | 1677 | 10.2% |
| 3 | 1073 | 6.5% |
| 5 | 1 | < 0.1% |
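The six-character labels such as "2014.0" suggest that year holds float-valued years rendered as text, which would explain why it is profiled as Categorical. This is an assumption; the report does not say how the column was stored. A minimal sketch of converting the labels to integer years and recomputing the shares:

```python
# Label counts copied from the Common Values table for year.
year_label_counts = {"2014.0": 1677, "2013.0": 1073, "2015.0": 1}

# "2014.0" -> 2014.0 -> 2014
year_counts = {int(float(label)): count
               for label, count in year_label_counts.items()}

total = sum(year_counts.values())
shares = {year: 100 * count / total for year, count in year_counts.items()}

for year in sorted(year_counts):
    print(year, year_counts[year], f"{shares[year]:.1f}%")
```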
month
Real number (ℝ)
High correlation
| Statistic | Value |
|---|---|
| Distinct | 12 |
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.9465649 |
| Minimum | 1 |
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.0 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 5 |
| median | 7 |
| Q3 | 10 |
| 95-th percentile | 11 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 2.9715711 |
|---|---|
| Coefficient of variation (CV) | 0.42777561 |
| Kurtosis | -1.0043626 |
| Mean | 6.9465649 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.30693242 |
| Sum | 19110 |
| Variance | 8.8302346 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 413 | 15.0% |
| 7 | 373 | 13.6% |
| 9 | 321 | 11.7% |
| 8 | 273 | 9.9% |
| 2 | 250 | 9.1% |
| 5 | 216 | 7.9% |
| 11 | 215 | 7.8% |
| 4 | 205 | 7.5% |
| 6 | 199 | 7.2% |
| 3 | 157 | 5.7% |
| Other values (2) | 129 | 4.7% |

Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 62 | 2.3% |
| 2 | 250 | 9.1% |
| 3 | 157 | 5.7% |
| 4 | 205 | 7.5% |
| 5 | 216 | 7.9% |
| 6 | 199 | 7.2% |
| 7 | 373 | 13.6% |
| 8 | 273 | 9.9% |
| 9 | 321 | 11.7% |
| 10 | 413 | 15.0% |

Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 12 | 67 | 2.4% |
| 11 | 215 | 7.8% |
| 10 | 413 | 15.0% |
| 9 | 321 | 11.7% |
| 8 | 273 | 9.9% |
| 7 | 373 | 13.6% |
| 6 | 199 | 7.2% |
| 5 | 216 | 7.9% |
| 4 | 205 | 7.5% |
| 3 | 157 | 5.7% |
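The descriptive statistics for month can be recomputed from its frequency tables. A minimal sketch, assuming the variance uses the sample (n − 1) denominator, which is the pandas default and reproduces the reported figure:

```python
import math

# Month -> count, copied from the frequency tables for month.
month_counts = {1: 62, 2: 250, 3: 157, 4: 205, 5: 216, 6: 199,
                7: 373, 8: 273, 9: 321, 10: 413, 11: 215, 12: 67}

n = sum(month_counts.values())
total = sum(month * count for month, count in month_counts.items())
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(count * (month - mean) ** 2
               for month, count in month_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total)
print(f"mean={mean:.7f} variance={variance:.7f} std={std:.7f} cv={cv:.8f}")
```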
season
Categorical
High correlation
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 169.3 KiB |
| Spring | |
|---|---|
| Winter | |
| Autumn | |
| Summer |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Summer |
|---|---|
| 2nd row | Autumn |
| 3rd row | Winter |
| 4th row | Autumn |
| 5th row | Autumn |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Spring | 949 | 34.5% |
| Winter | 845 | 30.7% |
| Autumn | 578 | 21.0% |
| Summer | 379 | 13.8% |
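The season counts are consistent with a Southern Hemisphere meteorological mapping of month to season (December to February as Summer, and so on). The report does not state how season was derived, so the mapping below is an assumption; it does reproduce the four counts exactly from the month frequencies.

```python
# Month -> count, copied from the month variable's frequency tables.
month_counts = {1: 62, 2: 250, 3: 157, 4: 205, 5: 216, 6: 199,
                7: 373, 8: 273, 9: 321, 10: 413, 11: 215, 12: 67}


def southern_season(month: int) -> str:
    """Map a calendar month to a Southern Hemisphere meteorological season."""
    if month in (12, 1, 2):
        return "Summer"
    if month in (3, 4, 5):
        return "Autumn"
    if month in (6, 7, 8):
        return "Winter"
    return "Spring"


season_counts = {}
for month, count in month_counts.items():
    season = southern_season(month)
    season_counts[season] = season_counts.get(season, 0) + count

print(season_counts)
```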
Most occurring characters
| Value | Count | Frequency (%) |
| n | 2372 | |
| r | 2173 | |
| i | 1794 | |
| u | 1535 | |
| t | 1423 | |
| m | 1336 | |
| S | 1328 | |
| e | 1224 | |
| p | 949 | |
| g | 949 | |
| Other values (2) | 1423 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 13755 | |
| Uppercase Letter | 2751 | 16.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 2372 | |
| r | 2173 | |
| i | 1794 | |
| u | 1535 | |
| t | 1423 | |
| m | 1336 | |
| e | 1224 | |
| p | 949 | |
| g | 949 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 1328 | |
| W | 845 | |
| A | 578 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 16506 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 2372 | |
| r | 2173 | |
| i | 1794 | |
| u | 1535 | |
| t | 1423 | |
| m | 1336 | |
| S | 1328 | |
| e | 1224 | |
| p | 949 | |
| g | 949 | |
| Other values (2) | 1423 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 16506 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| n | 2372 | |
| r | 2173 | |
| i | 1794 | |
| u | 1535 | |
| t | 1423 | |
| m | 1336 | |
| S | 1328 | |
| e | 1224 | |
| p | 949 | |
| g | 949 | |
| Other values (2) | 1423 |
latitude
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 174.6 KiB |
| -25.7479 |
|---|
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | -25.7479 |
|---|---|
| 2nd row | -25.7479 |
| 3rd row | -25.7479 |
| 4th row | -25.7479 |
| 5th row | -25.7479 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| -25.7479 | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 7 | 5502 | |
| - | 2751 | |
| 2 | 2751 | |
| 5 | 2751 | |
| . | 2751 | |
| 4 | 2751 | |
| 9 | 2751 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 16506 | |
| Dash Punctuation | 2751 | 12.5% |
| Other Punctuation | 2751 | 12.5% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 7 | 5502 | |
| 2 | 2751 | |
| 5 | 2751 | |
| 4 | 2751 | |
| 9 | 2751 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2751 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 2751 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 22008 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 7 | 5502 | |
| - | 2751 | |
| 2 | 2751 | |
| 5 | 2751 | |
| . | 2751 | |
| 4 | 2751 | |
| 9 | 2751 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 22008 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 7 | 5502 | |
| - | 2751 | |
| 2 | 2751 | |
| 5 | 2751 | |
| . | 2751 | |
| 4 | 2751 | |
| 9 | 2751 |
longitude
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 171.9 KiB |
| 28.2293 |
|---|
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 7 |
| Min length | 7 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 28.2293 |
|---|---|
| 2nd row | 28.2293 |
| 3rd row | 28.2293 |
| 4th row | 28.2293 |
| 5th row | 28.2293 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 28.2293 | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 8253 | |
| 8 | 2751 | 14.3% |
| . | 2751 | 14.3% |
| 9 | 2751 | 14.3% |
| 3 | 2751 | 14.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 16506 | |
| Other Punctuation | 2751 | 14.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 8253 | |
| 8 | 2751 | 16.7% |
| 9 | 2751 | 16.7% |
| 3 | 2751 | 16.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 2751 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 19257 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 8253 | |
| 8 | 2751 | 14.3% |
| . | 2751 | 14.3% |
| 9 | 2751 | 14.3% |
| 3 | 2751 | 14.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 19257 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 8253 | |
| 8 | 2751 | 14.3% |
| . | 2751 | 14.3% |
| 9 | 2751 | 14.3% |
| 3 | 2751 | 14.3% |
jhb_subregion
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 182.7 KiB |
| Eastern_JHB |
|---|
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 11 |
| Min length | 11 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Eastern_JHB |
|---|---|
| 2nd row | Eastern_JHB |
| 3rd row | Eastern_JHB |
| 4th row | Eastern_JHB |
| 5th row | Eastern_JHB |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Eastern_JHB | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| E | 2751 | |
| a | 2751 | |
| s | 2751 | |
| t | 2751 | |
| e | 2751 | |
| r | 2751 | |
| n | 2751 | |
| _ | 2751 | |
| J | 2751 | |
| H | 2751 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 16506 | |
| Uppercase Letter | 11004 | |
| Connector Punctuation | 2751 | 9.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 2751 | |
| s | 2751 | |
| t | 2751 | |
| e | 2751 | |
| r | 2751 | |
| n | 2751 |
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 2751 | |
| J | 2751 | |
| H | 2751 | |
| B | 2751 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 2751 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 27510 | |
| Common | 2751 | 9.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| E | 2751 | |
| a | 2751 | |
| s | 2751 | |
| t | 2751 | |
| e | 2751 | |
| r | 2751 | |
| n | 2751 | |
| J | 2751 | |
| H | 2751 | |
| B | 2751 |
Common
| Value | Count | Frequency (%) |
| _ | 2751 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 30261 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| E | 2751 | |
| a | 2751 | |
| s | 2751 | |
| t | 2751 | |
| e | 2751 | |
| r | 2751 | |
| n | 2751 | |
| _ | 2751 | |
| J | 2751 | |
| H | 2751 |
city
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 185.4 KiB |
| Johannesburg |
|---|
Length
| Max length | 12 |
|---|---|
| Median length | 12 |
| Mean length | 12 |
| Min length | 12 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Johannesburg |
|---|---|
| 2nd row | Johannesburg |
| 3rd row | Johannesburg |
| 4th row | Johannesburg |
| 5th row | Johannesburg |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Johannesburg | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 5502 | |
| J | 2751 | |
| o | 2751 | |
| h | 2751 | |
| a | 2751 | |
| e | 2751 | |
| s | 2751 | |
| b | 2751 | |
| u | 2751 | |
| r | 2751 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 30261 | |
| Uppercase Letter | 2751 | 8.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 5502 | |
| o | 2751 | |
| h | 2751 | |
| a | 2751 | |
| e | 2751 | |
| s | 2751 | |
| b | 2751 | |
| u | 2751 | |
| r | 2751 | |
| g | 2751 |
Uppercase Letter
| Value | Count | Frequency (%) |
| J | 2751 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 33012 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 5502 | |
| J | 2751 | |
| o | 2751 | |
| h | 2751 | |
| a | 2751 | |
| e | 2751 | |
| s | 2751 | |
| b | 2751 | |
| u | 2751 | |
| r | 2751 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 33012 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| n | 5502 | |
| J | 2751 | |
| o | 2751 | |
| h | 2751 | |
| a | 2751 | |
| e | 2751 | |
| s | 2751 | |
| b | 2751 | |
| u | 2751 | |
| r | 2751 |
province
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 171.9 KiB |
| Gauteng |
|---|
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 7 |
| Min length | 7 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Gauteng |
|---|---|
| 2nd row | Gauteng |
| 3rd row | Gauteng |
| 4th row | Gauteng |
| 5th row | Gauteng |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Gauteng | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| G | 2751 | |
| a | 2751 | |
| u | 2751 | |
| t | 2751 | |
| e | 2751 | |
| n | 2751 | |
| g | 2751 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 16506 | |
| Uppercase Letter | 2751 | 14.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 2751 | |
| u | 2751 | |
| t | 2751 | |
| e | 2751 | |
| n | 2751 | |
| g | 2751 |
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 2751 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 19257 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| G | 2751 | |
| a | 2751 | |
| u | 2751 | |
| t | 2751 | |
| e | 2751 | |
| n | 2751 | |
| g | 2751 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 19257 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| G | 2751 | |
| a | 2751 | |
| u | 2751 | |
| t | 2751 | |
| e | 2751 | |
| n | 2751 | |
| g | 2751 |
country
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 185.4 KiB |
| South Africa |
|---|
Length
| Max length | 12 |
|---|---|
| Median length | 12 |
| Mean length | 12 |
| Min length | 12 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | South Africa |
|---|---|
| 2nd row | South Africa |
| 3rd row | South Africa |
| 4th row | South Africa |
| 5th row | South Africa |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| South Africa | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 2751 | |
| o | 2751 | |
| u | 2751 | |
| t | 2751 | |
| h | 2751 | |
| (space) | 2751 | |
| A | 2751 | |
| f | 2751 | |
| r | 2751 | |
| i | 2751 | |
| Other values (2) | 5502 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 24759 | |
| Uppercase Letter | 5502 | 16.7% |
| Space Separator | 2751 | 8.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 2751 | |
| u | 2751 | |
| t | 2751 | |
| h | 2751 | |
| f | 2751 | |
| r | 2751 | |
| i | 2751 | |
| c | 2751 | |
| a | 2751 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 2751 | |
| A | 2751 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2751 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 30261 | |
| Common | 2751 | 8.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| S | 2751 | |
| o | 2751 | |
| u | 2751 | |
| t | 2751 | |
| h | 2751 | |
| A | 2751 | |
| f | 2751 | |
| r | 2751 | |
| i | 2751 | |
| c | 2751 |
Common
| Value | Count | Frequency (%) |
| (space) | 2751 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 33012 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| S | 2751 | |
| o | 2751 | |
| u | 2751 | |
| t | 2751 | |
| h | 2751 | |
| (space) | 2751 | |
| A | 2751 | |
| f | 2751 | |
| r | 2751 | |
| i | 2751 | |
| Other values (2) | 5502 |
Age (at enrolment)
Real number (ℝ)
Patient age at study enrollment
| Statistic | Value |
|---|---|
| Distinct | 59 |
| Distinct (%) | 2.1% |
| Missing | 6 |
| Missing (%) | 0.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 34.426958 |
| Minimum | 15 |
| Maximum | 76 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.0 KiB |
Quantile statistics
| Minimum | 15 |
|---|---|
| 5-th percentile | 20 |
| Q1 | 27 |
| median | 33 |
| Q3 | 40 |
| 95-th percentile | 54 |
| Maximum | 76 |
| Range | 61 |
| Interquartile range (IQR) | 13 |
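Range and IQR are simple differences of the quantiles above. The sketch also derives Tukey's 1.5 × IQR fences as one common way to read the quantiles for outliers; the fences are an illustration added here, not something the report itself computes.

```python
# Quantiles copied from the Age (at enrolment) quantile table.
minimum, q1, median, q3, maximum = 15, 27, 33, 40, 76

value_range = maximum - minimum
iqr = q3 - q1

# Tukey fences: values outside [lower_fence, upper_fence] are often
# treated as candidate outliers (a convention, not a rule of this report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"range={value_range} iqr={iqr}")
print(f"fences=({lower_fence}, {upper_fence})")
```

Under this convention the reported maximum of 76 lies above the upper fence, while the minimum of 15 lies inside the lower one.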
Descriptive statistics
| Standard deviation | 10.178108 |
|---|---|
| Coefficient of variation (CV) | 0.29564354 |
| Kurtosis | 0.24473046 |
| Mean | 34.426958 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | 0.70885633 |
| Sum | 94502 |
| Variance | 103.59388 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 31 | 125 | 4.5% |
| 30 | 117 | 4.3% |
| 29 | 116 | 4.2% |
| 28 | 113 | 4.1% |
| 27 | 108 | 3.9% |
| 32 | 106 | 3.9% |
| 26 | 104 | 3.8% |
| 34 | 102 | 3.7% |
| 24 | 101 | 3.7% |
| 33 | 97 | 3.5% |
| Other values (49) | 1656 |
| Value | Count | Frequency (%) |
| 15 | 4 | 0.1% |
| 16 | 3 | 0.1% |
| 17 | 15 | 0.5% |
| 18 | 24 | 0.9% |
| 19 | 40 | 1.5% |
| 20 | 59 | |
| 21 | 56 | |
| 22 | 73 | |
| 23 | 85 | |
| 24 | 101 |
| Value | Count | Frequency (%) |
| 76 | 1 | < 0.1% |
| 74 | 1 | < 0.1% |
| 72 | 2 | 0.1% |
| 71 | 1 | < 0.1% |
| 70 | 1 | < 0.1% |
| 69 | 2 | 0.1% |
| 68 | 3 | |
| 67 | 1 | < 0.1% |
| 66 | 1 | < 0.1% |
| 65 | 5 |
Sex
Categorical
Biological sex
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 4 |
| Missing (%) | 0.1% |
| Memory size | 165.9 KiB |
| Male | |
|---|---|
| Female |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.7564616 |
| Min length | 4 |
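The mean length is a count-weighted average of the two label lengths. A minimal sketch, assuming the 4 missing values are excluded from the length statistics (the reported mean is consistent with that):

```python
# Label counts copied from the Common Values table for Sex.
label_counts = {"Male": 1708, "Female": 1039}

n_labels = sum(label_counts.values())
total_chars = sum(len(label) * count for label, count in label_counts.items())
mean_length = total_chars / n_labels

print(n_labels, total_chars, f"{mean_length:.7f}")
```

The median length is 4 because "Male" accounts for more than half of the non-missing labels.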
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Female |
|---|---|
| 2nd row | Female |
| 3rd row | Male |
| 4th row | Male |
| 5th row | Female |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Male | 1708 | 62.1% |
| Female | 1039 | 37.8% |
| (Missing) | 4 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 3786 | |
| a | 2747 | |
| l | 2747 | |
| M | 1708 | |
| F | 1039 | 8.0% |
| m | 1039 | 8.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10319 | |
| Uppercase Letter | 2747 | 21.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 3786 | |
| a | 2747 | |
| l | 2747 | |
| m | 1039 | 10.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 1708 | |
| F | 1039 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 13066 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 3786 | |
| a | 2747 | |
| l | 2747 | |
| M | 1708 | |
| F | 1039 | 8.0% |
| m | 1039 | 8.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 13066 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 3786 | |
| a | 2747 | |
| l | 2747 | |
| M | 1708 | |
| F | 1039 | 8.0% |
| m | 1039 | 8.0% |
CD4 cell count (cells/µL)
Real number (ℝ)
High correlation Missing
CD4+ T lymphocyte count (missing codes removed)
| Statistic | Value |
|---|---|
| Distinct | 854 |
| Distinct (%) | 38.5% |
| Missing | 533 |
| Missing (%) | 19.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 456.95807 |
| Minimum | 3 |
| Maximum | 2703 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.0 KiB |
Quantile statistics
| Minimum | 3 |
|---|---|
| 5-th percentile | 108.85 |
| Q1 | 272 |
| median | 416 |
| Q3 | 589 |
| 95-th percentile | 937 |
| Maximum | 2703 |
| Range | 2700 |
| Interquartile range (IQR) | 317 |
Descriptive statistics
| Standard deviation | 268.47946 |
|---|---|
| Coefficient of variation (CV) | 0.58753632 |
| Kurtosis | 7.1691831 |
| Mean | 456.95807 |
| Median Absolute Deviation (MAD) | 155 |
| Skewness | 1.6497118 |
| Sum | 1013533 |
| Variance | 72081.223 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 350 | 9 | 0.3% |
| 315 | 9 | 0.3% |
| 500 | 9 | 0.3% |
| 467 | 9 | 0.3% |
| 420 | 8 | 0.3% |
| 336 | 8 | 0.3% |
| 443 | 8 | 0.3% |
| 354 | 8 | 0.3% |
| 414 | 8 | 0.3% |
| 564 | 8 | 0.3% |
| Other values (844) | 2134 | 77.6% |
| (Missing) | 533 | 19.4% |
| Value | Count | Frequency (%) |
| 3 | 2 | 0.1% |
| 6 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |
| 15 | 1 | < 0.1% |
| 16 | 1 | < 0.1% |
| 20 | 1 | < 0.1% |
| 21 | 1 | < 0.1% |
| 28 | 1 | < 0.1% |
| 29 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 2703 | 1 | < 0.1% |
| 2609 | 2 | 0.1% |
| 1996 | 1 | < 0.1% |
| 1781 | 1 | < 0.1% |
| 1725 | 1 | < 0.1% |
| 1577 | 1 | < 0.1% |
| 1568 | 1 | < 0.1% |
| 1564 | 1 | < 0.1% |
| 1549 | 1 | < 0.1% |
| 1508 | 1 | < 0.1% |
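Several of the CD4 statistics above are simple functions of the others, so they can be cross-checked by hand. The following is a minimal sketch of that check in Python; the variable names are illustrative and the numbers are copied from the tables above, not recomputed from the underlying data.

```python
# Values copied from the CD4 cell count tables above.
n_total, n_missing = 2751, 533
minimum, q1, q3, maximum = 3, 272, 589, 2703
mean, std, total = 456.95807, 268.47946, 1013533

n_observed = n_total - n_missing      # rows that actually carry a CD4 value
value_range = maximum - minimum       # "Range"
iqr = q3 - q1                         # "Interquartile range (IQR)"
cv = std / mean                       # "Coefficient of variation (CV)"
variance = std ** 2                   # "Variance"
mean_from_sum = total / n_observed    # "Sum" divided by the observed count

print(n_observed, value_range, iqr)
print(round(cv, 6), round(variance, 2), round(mean_from_sum, 5))
```

The derived values agree with the reported Range, IQR, CV, Variance and Mean to the precision shown, which suggests the Mean is taken over the 2218 non-missing rows rather than all 2751.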
HIV viral load (copies/mL)
Real number (ℝ)
Missing Zeros
HIV RNA copies per mL (missing codes removed)
| Distinct | 45 |
|---|---|
| Distinct (%) | 15.5% |
| Missing | 2461 |
| Missing (%) | 89.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 20363.586 |
| Minimum | 0 |
| Maximum | 2670000 |
| Zeros | 246 |
| Zeros (%) | 8.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 7860.2 |
| Maximum | 2670000 |
| Range | 2670000 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 196029.65 |
|---|---|
| Coefficient of variation (CV) | 9.6264796 |
| Kurtosis | 145.0072 |
| Mean | 20363.586 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 11.783887 |
| Sum | 5905440 |
| Variance | 3.8427622 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 246 | 8.9% |
| 8555 | 1 | < 0.1% |
| 378 | 1 | < 0.1% |
| 2435 | 1 | < 0.1% |
| 6442 | 1 | < 0.1% |
| 13795 | 1 | < 0.1% |
| 200 | 1 | < 0.1% |
| 31 | 1 | < 0.1% |
| 1898105 | 1 | < 0.1% |
| 132 | 1 | < 0.1% |
| Other values (35) | 35 | 1.3% |
| (Missing) | 2461 | 89.5% |
| Value | Count | Frequency (%) |
| 0 | 246 | 8.9% |
| 10 | 1 | < 0.1% |
| 31 | 1 | < 0.1% |
| 51 | 1 | < 0.1% |
| 74 | 1 | < 0.1% |
| 82 | 1 | < 0.1% |
| 87 | 1 | < 0.1% |
| 132 | 1 | < 0.1% |
| 143 | 1 | < 0.1% |
| 174 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 2670000 | 1 | < 0.1% |
| 1898105 | 1 | < 0.1% |
| 650442 | 1 | < 0.1% |
| 164351 | 1 | < 0.1% |
| 149247 | 1 | < 0.1% |
| 125054 | 1 | < 0.1% |
| 44011 | 1 | < 0.1% |
| 38500 | 1 | < 0.1% |
| 34868 | 1 | < 0.1% |
| 22276 | 1 | < 0.1% |
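The Zeros (%) figure above is expressed against all 2751 rows, which understates how dominant zeros are among the rows that have a viral load at all. A short sketch of the arithmetic, using only counts from the tables above (the helper `pct` is illustrative, and the report does not state whether 0 encodes an undetectable result):

```python
# Counts copied from the HIV viral load tables above.
n_total = 2751
n_missing = 2461
n_zero = 246

n_observed = n_total - n_missing   # rows with any viral load value
n_nonzero = n_observed - n_zero    # rows with a value above zero


def pct(part, whole):
    """Percentage rounded to one decimal, as in the report."""
    return round(100 * part / whole, 1)


print("observed rows:", n_observed)
print("zeros, % of all rows:", pct(n_zero, n_total))
print("zeros, % of observed rows:", pct(n_zero, n_observed))
print("non-zero results:", n_nonzero)
```

Because well over three quarters of the observed values are zero, Q1, the median, Q3 and the MAD all collapse to 0, while the mean and standard deviation are driven by a handful of very large values.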
date
Date
| Distinct | 447 |
|---|---|
| Distinct (%) | 16.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 43.0 KiB |
| Minimum | 2013-03-14 00:00:00 |
| Maximum | 2015-08-01 00:00:00 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
Country
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 185.4 KiB |
| South Africa |
|---|
Length
| Max length | 12 |
|---|---|
| Median length | 12 |
| Mean length | 12 |
| Min length | 12 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | South Africa |
|---|---|
| 2nd row | South Africa |
| 3rd row | South Africa |
| 4th row | South Africa |
| 5th row | South Africa |
Common Values
| Value | Count | Frequency (%) |
| South Africa | 2751 | 100.0% |
Words
| Value | Count | Frequency (%) |
| south | 2751 | 50.0% |
| africa | 2751 | 50.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 2751 | 8.3% |
| o | 2751 | 8.3% |
| u | 2751 | 8.3% |
| t | 2751 | 8.3% |
| h | 2751 | 8.3% |
| (space) | 2751 | 8.3% |
| A | 2751 | 8.3% |
| f | 2751 | 8.3% |
| r | 2751 | 8.3% |
| i | 2751 | 8.3% |
| Other values (2) | 5502 | 16.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 24759 | 75.0% |
| Uppercase Letter | 5502 | 16.7% |
| Space Separator | 2751 | 8.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| o | 2751 | 11.1% |
| u | 2751 | 11.1% |
| t | 2751 | 11.1% |
| h | 2751 | 11.1% |
| f | 2751 | 11.1% |
| r | 2751 | 11.1% |
| i | 2751 | 11.1% |
| c | 2751 | 11.1% |
| a | 2751 | 11.1% |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 2751 | 50.0% |
| A | 2751 | 50.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2751 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 30261 | 91.7% |
| Common | 2751 | 8.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| S | 2751 | 9.1% |
| o | 2751 | 9.1% |
| u | 2751 | 9.1% |
| t | 2751 | 9.1% |
| h | 2751 | 9.1% |
| A | 2751 | 9.1% |
| f | 2751 | 9.1% |
| r | 2751 | 9.1% |
| i | 2751 | 9.1% |
| c | 2751 | 9.1% |
Common
| Value | Count | Frequency (%) |
| (space) | 2751 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 33012 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| S | 2751 | 8.3% |
| o | 2751 | 8.3% |
| u | 2751 | 8.3% |
| t | 2751 | 8.3% |
| h | 2751 | 8.3% |
| (space) | 2751 | 8.3% |
| A | 2751 | 8.3% |
| f | 2751 | 8.3% |
| r | 2751 | 8.3% |
| i | 2751 | 8.3% |
| Other values (2) | 5502 | 16.7% |
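The character, category, script and block tables for a constant column follow directly from its single value. As a rough illustration, the category counts for Country can be rebuilt with Python's standard `unicodedata` module. This is a sketch under the assumption that categories are Unicode general categories (Ll = Lowercase Letter, Lu = Uppercase Letter, Zs = Space Separator); it is not necessarily how the profiling tool computes them.

```python
import unicodedata
from collections import Counter

value = "South Africa"   # the single value of the Country column
n_rows = 2751            # number of observations in the dataset

# Count Unicode general categories for one copy of the value, then scale
# by the number of rows, since every row holds the same string.
per_value = Counter(unicodedata.category(ch) for ch in value)
counts = {cat: k * n_rows for cat, k in per_value.items()}
total = sum(counts.values())

for cat, count in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(cat, count, f"{100 * count / total:.1f}%")
```

The totals match the Lowercase Letter, Uppercase Letter and Space Separator counts reported above, and the overall character count matches the ASCII block total.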
Clinical Study ID
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 228.4 KiB |
| Tholimpilo_HIV_Linkage_Study |
|---|
Length
| Max length | 28 |
|---|---|
| Median length | 28 |
| Mean length | 28 |
| Min length | 28 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Tholimpilo_HIV_Linkage_Study |
|---|---|
| 2nd row | Tholimpilo_HIV_Linkage_Study |
| 3rd row | Tholimpilo_HIV_Linkage_Study |
| 4th row | Tholimpilo_HIV_Linkage_Study |
| 5th row | Tholimpilo_HIV_Linkage_Study |
Common Values
| Value | Count | Frequency (%) |
| Tholimpilo_HIV_Linkage_Study | 2751 | 100.0% |
Words
| Value | Count | Frequency (%) |
| tholimpilo_hiv_linkage_study | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 8253 | 10.7% |
| _ | 8253 | 10.7% |
| o | 5502 | 7.1% |
| l | 5502 | 7.1% |
| T | 2751 | 3.6% |
| k | 2751 | 3.6% |
| d | 2751 | 3.6% |
| u | 2751 | 3.6% |
| t | 2751 | 3.6% |
| S | 2751 | 3.6% |
| Other values (12) | 33012 | 42.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 52269 | 67.9% |
| Uppercase Letter | 16506 | 21.4% |
| Connector Punctuation | 8253 | 10.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 8253 | 15.8% |
| o | 5502 | 10.5% |
| l | 5502 | 10.5% |
| k | 2751 | 5.3% |
| d | 2751 | 5.3% |
| u | 2751 | 5.3% |
| t | 2751 | 5.3% |
| e | 2751 | 5.3% |
| g | 2751 | 5.3% |
| a | 2751 | 5.3% |
| Other values (5) | 13755 | 26.3% |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 2751 | 16.7% |
| S | 2751 | 16.7% |
| L | 2751 | 16.7% |
| V | 2751 | 16.7% |
| I | 2751 | 16.7% |
| H | 2751 | 16.7% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 8253 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 68775 | 89.3% |
| Common | 8253 | 10.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 8253 | 12.0% |
| o | 5502 | 8.0% |
| l | 5502 | 8.0% |
| T | 2751 | 4.0% |
| k | 2751 | 4.0% |
| d | 2751 | 4.0% |
| u | 2751 | 4.0% |
| t | 2751 | 4.0% |
| S | 2751 | 4.0% |
| e | 2751 | 4.0% |
| Other values (11) | 30261 | 44.0% |
Common
| Value | Count | Frequency (%) |
| _ | 8253 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 77028 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 8253 | 10.7% |
| _ | 8253 | 10.7% |
| o | 5502 | 7.1% |
| l | 5502 | 7.1% |
| T | 2751 | 3.6% |
| k | 2751 | 3.6% |
| d | 2751 | 3.6% |
| u | 2751 | 3.6% |
| t | 2751 | 3.6% |
| S | 2751 | 3.6% |
| Other values (12) | 33012 | 42.9% |
Location of study follow-up
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 282.1 KiB |
| Aurum Institute - Multi-site Gauteng and Limpopo |
|---|
Length
| Max length | 48 |
|---|---|
| Median length | 48 |
| Mean length | 48 |
| Min length | 48 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Aurum Institute - Multi-site Gauteng and Limpopo |
|---|---|
| 2nd row | Aurum Institute - Multi-site Gauteng and Limpopo |
| 3rd row | Aurum Institute - Multi-site Gauteng and Limpopo |
| 4th row | Aurum Institute - Multi-site Gauteng and Limpopo |
| 5th row | Aurum Institute - Multi-site Gauteng and Limpopo |
Common Values
| Value | Count | Frequency (%) |
| Aurum Institute - Multi-site Gauteng and Limpopo | 2751 | 100.0% |
Words
| Value | Count | Frequency (%) |
| aurum | 2751 | 14.3% |
| institute | 2751 | 14.3% |
| | 2751 | 14.3% |
| multi-site | 2751 | 14.3% |
| gauteng | 2751 | 14.3% |
| and | 2751 | 14.3% |
| limpopo | 2751 | 14.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 16506 | 12.5% |
| t | 16506 | 12.5% |
| u | 13755 | 10.4% |
| i | 11004 | 8.3% |
| e | 8253 | 6.2% |
| n | 8253 | 6.2% |
| p | 5502 | 4.2% |
| a | 5502 | 4.2% |
| - | 5502 | 4.2% |
| o | 5502 | 4.2% |
| Other values (11) | 35763 | 27.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 96285 | 72.9% |
| Space Separator | 16506 | 12.5% |
| Uppercase Letter | 13755 | 10.4% |
| Dash Punctuation | 5502 | 4.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 16506 | 17.1% |
| u | 13755 | 14.3% |
| i | 11004 | 11.4% |
| e | 8253 | 8.6% |
| n | 8253 | 8.6% |
| p | 5502 | 5.7% |
| a | 5502 | 5.7% |
| o | 5502 | 5.7% |
| s | 5502 | 5.7% |
| m | 5502 | 5.7% |
| Other values (4) | 11004 | 11.4% |
Uppercase Letter
| Value | Count | Frequency (%) |
| I | 2751 | 20.0% |
| M | 2751 | 20.0% |
| G | 2751 | 20.0% |
| L | 2751 | 20.0% |
| A | 2751 | 20.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 16506 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 5502 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 110040 | 83.3% |
| Common | 22008 | 16.7% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 16506 | 15.0% |
| u | 13755 | 12.5% |
| i | 11004 | 10.0% |
| e | 8253 | 7.5% |
| n | 8253 | 7.5% |
| p | 5502 | 5.0% |
| a | 5502 | 5.0% |
| o | 5502 | 5.0% |
| s | 5502 | 5.0% |
| m | 5502 | 5.0% |
| Other values (9) | 24759 | 22.5% |
Common
| Value | Count | Frequency (%) |
| (space) | 16506 | 75.0% |
| - | 5502 | 25.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 132048 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 16506 | 12.5% |
| t | 16506 | 12.5% |
| u | 13755 | 10.4% |
| i | 11004 | 8.3% |
| e | 8253 | 6.2% |
| n | 8253 | 6.2% |
| p | 5502 | 4.2% |
| a | 5502 | 4.2% |
| - | 5502 | 4.2% |
| o | 5502 | 4.2% |
| Other values (11) | 35763 | 27.1% |
coordinate_source
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 188.1 KiB |
| JHB_Aurum_009 |
|---|
Length
| Max length | 13 |
|---|---|
| Median length | 13 |
| Mean length | 13 |
| Min length | 13 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | JHB_Aurum_009 |
|---|---|
| 2nd row | JHB_Aurum_009 |
| 3rd row | JHB_Aurum_009 |
| 4th row | JHB_Aurum_009 |
| 5th row | JHB_Aurum_009 |
Common Values
| Value | Count | Frequency (%) |
| JHB_Aurum_009 | 2751 | 100.0% |
Words
| Value | Count | Frequency (%) |
| jhb_aurum_009 | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| _ | 5502 | 15.4% |
| u | 5502 | 15.4% |
| 0 | 5502 | 15.4% |
| J | 2751 | 7.7% |
| H | 2751 | 7.7% |
| B | 2751 | 7.7% |
| A | 2751 | 7.7% |
| r | 2751 | 7.7% |
| m | 2751 | 7.7% |
| 9 | 2751 | 7.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11004 | 30.8% |
| Uppercase Letter | 11004 | 30.8% |
| Decimal Number | 8253 | 23.1% |
| Connector Punctuation | 5502 | 15.4% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| J | 2751 | 25.0% |
| H | 2751 | 25.0% |
| B | 2751 | 25.0% |
| A | 2751 | 25.0% |
Lowercase Letter
| Value | Count | Frequency (%) |
| u | 5502 | 50.0% |
| r | 2751 | 25.0% |
| m | 2751 | 25.0% |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 5502 | 66.7% |
| 9 | 2751 | 33.3% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 5502 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 22008 | 61.5% |
| Common | 13755 | 38.5% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| u | 5502 | 25.0% |
| J | 2751 | 12.5% |
| H | 2751 | 12.5% |
| B | 2751 | 12.5% |
| A | 2751 | 12.5% |
| r | 2751 | 12.5% |
| m | 2751 | 12.5% |
Common
| Value | Count | Frequency (%) |
| _ | 5502 | 40.0% |
| 0 | 5502 | 40.0% |
| 9 | 2751 | 20.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 35763 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| _ | 5502 | 15.4% |
| u | 5502 | 15.4% |
| 0 | 5502 | 15.4% |
| J | 2751 | 7.7% |
| H | 2751 | 7.7% |
| B | 2751 | 7.7% |
| A | 2751 | 7.7% |
| r | 2751 | 7.7% |
| m | 2751 | 7.7% |
| 9 | 2751 | 7.7% |
coordinate_precision
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 163.9 KiB |
| high |
|---|
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | high |
|---|---|
| 2nd row | high |
| 3rd row | high |
| 4th row | high |
| 5th row | high |
Common Values
| Value | Count | Frequency (%) |
| high | 2751 | 100.0% |
Words
| Value | Count | Frequency (%) |
| high | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| h | 5502 | 50.0% |
| i | 2751 | 25.0% |
| g | 2751 | 25.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 11004 | 100.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| h | 5502 | 50.0% |
| i | 2751 | 25.0% |
| g | 2751 | 25.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 11004 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| h | 5502 | 50.0% |
| i | 2751 | 25.0% |
| g | 2751 | 25.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 11004 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| h | 5502 | 50.0% |
| i | 2751 | 25.0% |
| g | 2751 | 25.0% |
geographic_source
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 204.2 KiB |
| harmonized_datasets |
|---|
Length
| Max length | 19 |
|---|---|
| Median length | 19 |
| Mean length | 19 |
| Min length | 19 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | harmonized_datasets |
|---|---|
| 2nd row | harmonized_datasets |
| 3rd row | harmonized_datasets |
| 4th row | harmonized_datasets |
| 5th row | harmonized_datasets |
Common Values
| Value | Count | Frequency (%) |
| harmonized_datasets | 2751 | 100.0% |
Words
| Value | Count | Frequency (%) |
| harmonized_datasets | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 8253 | 15.8% |
| e | 5502 | 10.5% |
| d | 5502 | 10.5% |
| t | 5502 | 10.5% |
| s | 5502 | 10.5% |
| h | 2751 | 5.3% |
| r | 2751 | 5.3% |
| m | 2751 | 5.3% |
| o | 2751 | 5.3% |
| n | 2751 | 5.3% |
| Other values (3) | 8253 | 15.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 49518 | 94.7% |
| Connector Punctuation | 2751 | 5.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 8253 | 16.7% |
| e | 5502 | 11.1% |
| d | 5502 | 11.1% |
| t | 5502 | 11.1% |
| s | 5502 | 11.1% |
| h | 2751 | 5.6% |
| r | 2751 | 5.6% |
| m | 2751 | 5.6% |
| o | 2751 | 5.6% |
| n | 2751 | 5.6% |
| Other values (2) | 5502 | 11.1% |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 2751 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 49518 | 94.7% |
| Common | 2751 | 5.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 8253 | 16.7% |
| e | 5502 | 11.1% |
| d | 5502 | 11.1% |
| t | 5502 | 11.1% |
| s | 5502 | 11.1% |
| h | 2751 | 5.6% |
| r | 2751 | 5.6% |
| m | 2751 | 5.6% |
| o | 2751 | 5.6% |
| n | 2751 | 5.6% |
| Other values (2) | 5502 | 11.1% |
Common
| Value | Count | Frequency (%) |
| _ | 2751 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 52269 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 8253 | 15.8% |
| e | 5502 | 10.5% |
| d | 5502 | 10.5% |
| t | 5502 | 10.5% |
| s | 5502 | 10.5% |
| h | 2751 | 5.3% |
| r | 2751 | 5.3% |
| m | 2751 | 5.3% |
| o | 2751 | 5.3% |
| n | 2751 | 5.3% |
| Other values (3) | 8253 | 15.8% |
HIV_status
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 174.6 KiB |
| Positive |
|---|
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 8 |
| Min length | 8 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Positive |
|---|---|
| 2nd row | Positive |
| 3rd row | Positive |
| 4th row | Positive |
| 5th row | Positive |
Common Values
| Value | Count | Frequency (%) |
| Positive | 2751 | 100.0% |
Words
| Value | Count | Frequency (%) |
| positive | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 5502 | 25.0% |
| P | 2751 | 12.5% |
| o | 2751 | 12.5% |
| s | 2751 | 12.5% |
| t | 2751 | 12.5% |
| v | 2751 | 12.5% |
| e | 2751 | 12.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 19257 | 87.5% |
| Uppercase Letter | 2751 | 12.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 5502 | 28.6% |
| o | 2751 | 14.3% |
| s | 2751 | 14.3% |
| t | 2751 | 14.3% |
| v | 2751 | 14.3% |
| e | 2751 | 14.3% |
Uppercase Letter
| Value | Count | Frequency (%) |
| P | 2751 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 22008 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 5502 | 25.0% |
| P | 2751 | 12.5% |
| o | 2751 | 12.5% |
| s | 2751 | 12.5% |
| t | 2751 | 12.5% |
| v | 2751 | 12.5% |
| e | 2751 | 12.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 22008 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 5502 | 25.0% |
| P | 2751 | 12.5% |
| o | 2751 | 12.5% |
| s | 2751 | 12.5% |
| t | 2751 | 12.5% |
| v | 2751 | 12.5% |
| e | 2751 | 12.5% |
johannesburg_metro_valid
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 161.2 KiB |
| 1.0 |
|---|
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 2751 | 100.0% |
Words
| Value | Count | Frequency (%) |
| 1.0 | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 2751 | 33.3% |
| . | 2751 | 33.3% |
| 0 | 2751 | 33.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5502 | 66.7% |
| Other Punctuation | 2751 | 33.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 2751 | 50.0% |
| 0 | 2751 | 50.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 2751 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 8253 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 2751 | 33.3% |
| . | 2751 | 33.3% |
| 0 | 2751 | 33.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8253 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 2751 | 33.3% |
| . | 2751 | 33.3% |
| 0 | 2751 | 33.3% |
study_site_location
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 247.2 KiB |
| Tembisa/East Rand (Aurum Institute) |
|---|
Length
| Max length | 35 |
|---|---|
| Median length | 35 |
| Mean length | 35 |
| Min length | 35 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Tembisa/East Rand (Aurum Institute) |
|---|---|
| 2nd row | Tembisa/East Rand (Aurum Institute) |
| 3rd row | Tembisa/East Rand (Aurum Institute) |
| 4th row | Tembisa/East Rand (Aurum Institute) |
| 5th row | Tembisa/East Rand (Aurum Institute) |
Common Values
| Value | Count | Frequency (%) |
| Tembisa/East Rand (Aurum Institute) | 2751 | 100.0% |
Words
| Value | Count | Frequency (%) |
| tembisa/east | 2751 | 25.0% |
| rand | 2751 | 25.0% |
| aurum | 2751 | 25.0% |
| institute | 2751 | 25.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 11004 | 11.4% |
| (space) | 8253 | 8.6% |
| s | 8253 | 8.6% |
| a | 8253 | 8.6% |
| u | 8253 | 8.6% |
| m | 5502 | 5.7% |
| i | 5502 | 5.7% |
| e | 5502 | 5.7% |
| n | 5502 | 5.7% |
| ( | 2751 | 2.9% |
| Other values (10) | 27510 | 28.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 66024 | 68.6% |
| Uppercase Letter | 13755 | 14.3% |
| Space Separator | 8253 | 8.6% |
| Open Punctuation | 2751 | 2.9% |
| Other Punctuation | 2751 | 2.9% |
| Close Punctuation | 2751 | 2.9% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 11004 | 16.7% |
| s | 8253 | 12.5% |
| a | 8253 | 12.5% |
| u | 8253 | 12.5% |
| m | 5502 | 8.3% |
| i | 5502 | 8.3% |
| e | 5502 | 8.3% |
| n | 5502 | 8.3% |
| r | 2751 | 4.2% |
| d | 2751 | 4.2% |
Uppercase Letter
| Value | Count | Frequency (%) |
| I | 2751 | 20.0% |
| A | 2751 | 20.0% |
| T | 2751 | 20.0% |
| R | 2751 | 20.0% |
| E | 2751 | 20.0% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 8253 | 100.0% |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 2751 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 2751 | 100.0% |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 2751 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 79779 | 82.9% |
| Common | 16506 | 17.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 11004 | 13.8% |
| s | 8253 | 10.3% |
| a | 8253 | 10.3% |
| u | 8253 | 10.3% |
| m | 5502 | 6.9% |
| i | 5502 | 6.9% |
| e | 5502 | 6.9% |
| n | 5502 | 6.9% |
| I | 2751 | 3.4% |
| r | 2751 | 3.4% |
| Other values (6) | 16506 | 20.7% |
Common
| Value | Count | Frequency (%) |
| (space) | 8253 | 50.0% |
| ( | 2751 | 16.7% |
| / | 2751 | 16.7% |
| ) | 2751 | 16.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 96285 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 11004 | 11.4% |
| (space) | 8253 | 8.6% |
| s | 8253 | 8.6% |
| a | 8253 | 8.6% |
| u | 8253 | 8.6% |
| m | 5502 | 5.7% |
| i | 5502 | 5.7% |
| e | 5502 | 5.7% |
| n | 5502 | 5.7% |
| ( | 2751 | 2.9% |
| Other values (10) | 27510 | 28.6% |
| Distinct | 11 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 1616 |
| Missing (%) | 58.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.451807 |
| Minimum | 9.356 |
| Maximum | 23.589 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.0 KiB |
Quantile statistics
| Minimum | 9.356 |
|---|---|
| 5-th percentile | 9.356 |
| Q1 | 13.213 |
| median | 14.195 |
| Q3 | 19.293 |
| 95-th percentile | 23.589 |
| Maximum | 23.589 |
| Range | 14.233 |
| Interquartile range (IQR) | 6.08 |
Descriptive statistics
| Standard deviation | 3.5385321 |
|---|---|
| Coefficient of variation (CV) | 0.22900442 |
| Kurtosis | -0.30036519 |
| Mean | 15.451807 |
| Median Absolute Deviation (MAD) | 0.982 |
| Skewness | 0.47348153 |
| Sum | 17537.801 |
| Variance | 12.521209 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 19.293 | 214 | 7.8% |
| 13.213 | 208 | 7.6% |
| 14.195 | 187 | 6.8% |
| 13.868 | 144 | 5.2% |
| 9.356 | 98 | 3.6% |
| 18.203 | 67 | 2.4% |
| 23.589 | 62 | 2.3% |
| 13.656 | 53 | 1.9% |
| 13.316 | 41 | 1.5% |
| 17.799 | 39 | 1.4% |
| (Missing) | 1616 | 58.7% |
| Value | Count | Frequency (%) |
| 9.356 | 98 | 3.6% |
| 13.213 | 208 | 7.6% |
| 13.316 | 41 | 1.5% |
| 13.656 | 53 | 1.9% |
| 13.868 | 144 | 5.2% |
| 14.195 | 187 | 6.8% |
| 17.799 | 39 | 1.4% |
| 18.203 | 67 | 2.4% |
| 19.293 | 214 | 7.8% |
| 20.293 | 22 | 0.8% |
| Value | Count | Frequency (%) |
| 23.589 | 62 | 2.3% |
| 20.293 | 22 | 0.8% |
| 19.293 | 214 | 7.8% |
| 18.203 | 67 | 2.4% |
| 17.799 | 39 | 1.4% |
| 14.195 | 187 | 6.8% |
| 13.868 | 144 | 5.2% |
| 13.656 | 53 | 1.9% |
| 13.316 | 41 | 1.5% |
| 13.213 | 208 | 7.6% |
| Distinct | 11 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 1616 |
| Missing (%) | 58.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 23.182599 |
| Minimum | 17.553 |
| Maximum | 30.083 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.0 KiB |
Quantile statistics
| Minimum | 17.553 |
|---|---|
| 5-th percentile | 17.553 |
| Q1 | 21.474 |
| median | 22.413 |
| Q3 | 26.343 |
| 95-th percentile | 30.083 |
| Maximum | 30.083 |
| Range | 12.53 |
| Interquartile range (IQR) | 4.869 |
Descriptive statistics
| Standard deviation | 2.9483779 |
|---|---|
| Coefficient of variation (CV) | 0.12718065 |
| Kurtosis | 0.15361931 |
| Mean | 23.182599 |
| Median Absolute Deviation (MAD) | 1.066 |
| Skewness | 0.324421 |
| Sum | 26312.25 |
| Variance | 8.6929324 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 26.343 | 214 | 7.8% |
| 22.23 | 208 | 7.6% |
| 23.023 | 187 | 6.8% |
| 21.347 | 144 | 5.2% |
| 17.553 | 98 | 3.6% |
| 22.413 | 67 | 2.4% |
| 30.083 | 62 | 2.3% |
| 21.474 | 53 | 1.9% |
| 20.768 | 41 | 1.5% |
| 25.8 | 39 | 1.4% |
| (Missing) | 1616 | 58.7% |
| Value | Count | Frequency (%) |
| 17.553 | 98 | 3.6% |
| 20.768 | 41 | 1.5% |
| 21.347 | 144 | 5.2% |
| 21.474 | 53 | 1.9% |
| 22.23 | 208 | 7.6% |
| 22.413 | 67 | 2.4% |
| 23.023 | 187 | 6.8% |
| 25.8 | 39 | 1.4% |
| 26.343 | 214 | 7.8% |
| 26.769 | 22 | 0.8% |
| Value | Count | Frequency (%) |
| 30.083 | 62 | 2.3% |
| 26.769 | 22 | 0.8% |
| 26.343 | 214 | 7.8% |
| 25.8 | 39 | 1.4% |
| 23.023 | 187 | 6.8% |
| 22.413 | 67 | 2.4% |
| 22.23 | 208 | 7.6% |
| 21.474 | 53 | 1.9% |
| 21.347 | 144 | 5.2% |
| 20.768 | 41 | 1.5% |
climate_daily_min_temp
Real number (ℝ)
High correlation Missing
| Distinct | 11 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 1616 |
| Missing (%) | 58.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.5503286 |
| Minimum | 2.343 |
| Maximum | 14.954 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.0 KiB |
Quantile statistics
| Minimum | 2.343 |
|---|---|
| 5-th percentile | 2.343 |
| Q1 | 3.763 |
| median | 6.616 |
| Q3 | 11.253 |
| 95-th percentile | 14.954 |
| Maximum | 14.954 |
| Range | 12.611 |
| Interquartile range (IQR) | 7.49 |
Descriptive statistics
| Standard deviation | 4.0456474 |
|---|---|
| Coefficient of variation (CV) | 0.53582401 |
| Kurtosis | -1.0855077 |
| Mean | 7.5503286 |
| Median Absolute Deviation (MAD) | 2.853 |
| Skewness | 0.50562955 |
| Sum | 8569.623 |
| Variance | 16.367263 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 11.253 | 214 | 7.8% |
| 3.763 | 208 | 7.6% |
| 4.56 | 187 | 6.8% |
| 7.436 | 144 | 5.2% |
| 2.343 | 98 | 3.6% |
| 14.79 | 67 | 2.4% |
| 14.954 | 62 | 2.3% |
| 6.034 | 53 | 1.9% |
| 6.616 | 41 | 1.5% |
| 10.493 | 39 | 1.4% |
| (Missing) | 1616 | 58.7% |
| Value | Count | Frequency (%) |
| 2.343 | 98 | 3.6% |
| 3.763 | 208 | 7.6% |
| 4.56 | 187 | 6.8% |
| 6.034 | 53 | 1.9% |
| 6.616 | 41 | 1.5% |
| 7.436 | 144 | 5.2% |
| 10.493 | 39 | 1.4% |
| 11.253 | 214 | 7.8% |
| 13.968 | 22 | 0.8% |
| 14.79 | 67 | 2.4% |
| Value | Count | Frequency (%) |
| 14.954 | 62 | 2.3% |
| 14.79 | 67 | 2.4% |
| 13.968 | 22 | 0.8% |
| 11.253 | 214 | 7.8% |
| 10.493 | 39 | 1.4% |
| 7.436 | 144 | 5.2% |
| 6.616 | 41 | 1.5% |
| 6.034 | 53 | 1.9% |
| 4.56 | 187 | 6.8% |
| 3.763 | 208 | 7.6% |
climate_7d_mean_temp
Real number (ℝ)
High correlation Missing
| Distinct | 11 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 1616 |
| Missing (%) | 58.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.139061 |
| Minimum | 9.215 |
| Maximum | 21.742 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.0 KiB |
Quantile statistics
| Minimum | 9.215 |
|---|---|
| 5-th percentile | 9.215 |
| Q1 | 11.927 |
| median | 16.313 |
| Q3 | 19.038 |
| 95-th percentile | 21.742 |
| Maximum | 21.742 |
| Range | 12.527 |
| Interquartile range (IQR) | 7.111 |
Descriptive statistics
| Standard deviation | 3.6217705 |
|---|---|
| Coefficient of variation (CV) | 0.2392335 |
| Kurtosis | -1.2015134 |
| Mean | 15.139061 |
| Median Absolute Deviation (MAD) | 3.532 |
| Skewness | 0.034902052 |
| Sum | 17182.834 |
| Variance | 13.117222 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 19.038 | 214 | 7.8% |
| 16.313 | 208 | 7.6% |
| 11.927 | 187 | 6.8% |
| 12.781 | 144 | 5.2% |
| 9.215 | 98 | 3.6% |
| 18.254 | 67 | 2.4% |
| 21.742 | 62 | 2.3% |
| 10.793 | 53 | 1.9% |
| 12.665 | 41 | 1.5% |
| 16.471 | 39 | 1.4% |
| (Missing) | 1616 | 58.7% |
| Value | Count | Frequency (%) |
| 9.215 | 98 | 3.6% |
| 10.793 | 53 | 1.9% |
| 11.927 | 187 | 6.8% |
| 12.665 | 41 | 1.5% |
| 12.781 | 144 | 5.2% |
| 16.313 | 208 | 7.6% |
| 16.471 | 39 | 1.4% |
| 18.254 | 67 | 2.4% |
| 19.038 | 214 | 7.8% |
| 19.865 | 22 | 0.8% |
| Value | Count | Frequency (%) |
| 21.742 | 62 | 2.3% |
| 19.865 | 22 | 0.8% |
| 19.038 | 214 | 7.8% |
| 18.254 | 67 | 2.4% |
| 16.471 | 39 | 1.4% |
| 16.313 | 208 | 7.6% |
| 12.781 | 144 | 5.2% |
| 12.665 | 41 | 1.5% |
| 11.927 | 187 | 6.8% |
| 10.793 | 53 | 1.9% |
climate_7d_max_temp
Real number (ℝ)
High correlation Missing
| Distinct | 11 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 1616 |
| Missing (%) | 58.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 25.916914 |
| Minimum | 17.721 |
| Maximum | 30.867 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.0 KiB |
Quantile statistics
| Minimum | 17.721 |
|---|---|
| 5-th percentile | 17.721 |
| Q1 | 21.977 |
| median | 26.996 |
| Q3 | 29.423 |
| 95-th percentile | 30.867 |
| Maximum | 30.867 |
| Range | 13.146 |
| Interquartile range (IQR) | 7.446 |
Descriptive statistics
| Standard deviation | 4.0747905 |
|---|---|
| Coefficient of variation (CV) | 0.15722514 |
| Kurtosis | -0.91203039 |
| Mean | 25.916914 |
| Median Absolute Deviation (MAD) | 2.708 |
| Skewness | -0.60654343 |
| Sum | 29415.697 |
| Variance | 16.603917 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 29.704 | 214 | 7.8% |
| 29.423 | 208 | 7.6% |
| 25.079 | 187 | 6.8% |
| 21.52 | 144 | 5.2% |
| 17.721 | 98 | 3.6% |
| 26.996 | 67 | 2.4% |
| 30.867 | 62 | 2.3% |
| 21.977 | 53 | 1.9% |
| 20.768 | 41 | 1.5% |
| 26.761 | 39 | 1.4% |
| (Missing) | 1616 | 58.7% |
| Value | Count | Frequency (%) |
| 17.721 | 98 | 3.6% |
| 20.768 | 41 | 1.5% |
| 21.52 | 144 | 5.2% |
| 21.977 | 53 | 1.9% |
| 25.079 | 187 | 6.8% |
| 26.761 | 39 | 1.4% |
| 26.996 | 67 | 2.4% |
| 28.696 | 22 | 0.8% |
| 29.423 | 208 | 7.6% |
| 29.704 | 214 | 7.8% |
| Value | Count | Frequency (%) |
| 30.867 | 62 | 2.3% |
| 29.704 | 214 | 7.8% |
| 29.423 | 208 | 7.6% |
| 28.696 | 22 | 0.8% |
| 26.996 | 67 | 2.4% |
| 26.761 | 39 | 1.4% |
| 25.079 | 187 | 6.8% |
| 21.977 | 53 | 1.9% |
| 21.52 | 144 | 5.2% |
| 20.768 | 41 | 1.5% |
climate_14d_mean_temp
Real number (ℝ)
High correlation Missing
| Distinct | 11 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 1616 |
| Missing (%) | 58.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 16.042067 |
| Minimum | 10.426 |
| Maximum | 21.69 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.0 KiB |
Quantile statistics
| Minimum | 10.426 |
|---|---|
| 5-th percentile | 10.426 |
| Q1 | 12.258 |
| median | 18.254 |
| Q3 | 19.069 |
| 95-th percentile | 21.69 |
| Maximum | 21.69 |
| Range | 11.264 |
| Interquartile range (IQR) | 6.811 |
Descriptive statistics
| Standard deviation | 3.3876747 |
|---|---|
| Coefficient of variation (CV) | 0.21117446 |
| Kurtosis | -1.3067725 |
| Mean | 16.042067 |
| Median Absolute Deviation (MAD) | 3.436 |
| Skewness | -0.22262646 |
| Sum | 18207.746 |
| Variance | 11.47634 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 19.069 | 214 | 7.8% |
| 18.483 | 208 | 7.6% |
| 14.595 | 187 | 6.8% |
| 12.258 | 144 | 5.2% |
| 10.426 | 98 | 3.6% |
| 18.254 | 67 | 2.4% |
| 21.69 | 62 | 2.3% |
| 11.532 | 53 | 1.9% |
| 12.57 | 41 | 1.5% |
| 16.057 | 39 | 1.4% |
| (Missing) | 1616 | 58.7% |
| Value | Count | Frequency (%) |
| 10.426 | 98 | 3.6% |
| 11.532 | 53 | 1.9% |
| 12.258 | 144 | 5.2% |
| 12.57 | 41 | 1.5% |
| 14.595 | 187 | 6.8% |
| 16.057 | 39 | 1.4% |
| 18.254 | 67 | 2.4% |
| 18.483 | 208 | 7.6% |
| 19.069 | 214 | 7.8% |
| 20.262 | 22 | 0.8% |
| Value | Count | Frequency (%) |
| 21.69 | 62 | 2.3% |
| 20.262 | 22 | 0.8% |
| 19.069 | 214 | 7.8% |
| 18.483 | 208 | 7.6% |
| 18.254 | 67 | 2.4% |
| 16.057 | 39 | 1.4% |
| 14.595 | 187 | 6.8% |
| 12.57 | 41 | 1.5% |
| 12.258 | 144 | 5.2% |
| 11.532 | 53 | 1.9% |
climate_30d_mean_temp
Real number (ℝ)
High correlation Missing
| Distinct | 11 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 1616 |
| Missing (%) | 58.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 16.024641 |
| Minimum | 10.635 |
| Maximum | 21.041 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.0 KiB |
Quantile statistics
| Minimum | 10.635 |
|---|---|
| 5-th percentile | 10.635 |
| Q1 | 11.635 |
| median | 18.576 |
| Q3 | 18.854 |
| 95-th percentile | 21.041 |
| Maximum | 21.041 |
| Range | 10.406 |
| Interquartile range (IQR) | 7.219 |
Descriptive statistics
| Standard deviation | 3.4391822 |
|---|---|
| Coefficient of variation (CV) | 0.21461837 |
| Kurtosis | -1.3241931 |
| Mean | 16.024641 |
| Median Absolute Deviation (MAD) | 2.465 |
| Skewness | -0.42123309 |
| Sum | 18187.967 |
| Variance | 11.827974 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 18.854 | 214 | 7.8% |
| 18.576 | 208 | 7.6% |
| 15.421 | 187 | 6.8% |
| 11.076 | 144 | 5.2% |
| 10.635 | 98 | 3.6% |
| 18.794 | 67 | 2.4% |
| 21.041 | 62 | 2.3% |
| 11.635 | 53 | 1.9% |
| 12.856 | 41 | 1.5% |
| 15.775 | 39 | 1.4% |
| (Missing) | 1616 | 58.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10.635 | 98 | 3.6% |
| 11.076 | 144 | 5.2% |
| 11.635 | 53 | 1.9% |
| 12.856 | 41 | 1.5% |
| 15.421 | 187 | 6.8% |
| 15.775 | 39 | 1.4% |
| 18.576 | 208 | 7.6% |
| 18.794 | 67 | 2.4% |
| 18.854 | 214 | 7.8% |
| 20.263 | 22 | 0.8% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 21.041 | 62 | 2.3% |
| 20.263 | 22 | 0.8% |
| 18.854 | 214 | 7.8% |
| 18.794 | 67 | 2.4% |
| 18.576 | 208 | 7.6% |
| 15.775 | 39 | 1.4% |
| 15.421 | 187 | 6.8% |
| 12.856 | 41 | 1.5% |
| 11.635 | 53 | 1.9% |
| 11.076 | 144 | 5.2% |
climate_temp_anomaly
Real number (ℝ)
| Distinct | 11 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 1616 |
| Missing (%) | 58.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.1579163 |
| Minimum | 3.618 |
| Maximum | 10.271 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.0 KiB |
Quantile statistics
| Minimum | 3.618 |
|---|---|
| 5-th percentile | 3.618 |
| Q1 | 6.505 |
| median | 7.489 |
| Q3 | 9.042 |
| 95-th percentile | 10.271 |
| Maximum | 10.271 |
| Range | 6.653 |
| Interquartile range (IQR) | 2.537 |
Descriptive statistics
| Standard deviation | 2.2633511 |
|---|---|
| Coefficient of variation (CV) | 0.31620252 |
| Kurtosis | -0.9276673 |
| Mean | 7.1579163 |
| Median Absolute Deviation (MAD) | 1.553 |
| Skewness | -0.39512055 |
| Sum | 8124.235 |
| Variance | 5.1227584 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 7.489 | 214 | 7.8% |
| 3.654 | 208 | 7.6% |
| 7.602 | 187 | 6.8% |
| 10.271 | 144 | 5.2% |
| 6.918 | 98 | 3.6% |
| 3.618 | 67 | 2.4% |
| 9.042 | 62 | 2.3% |
| 9.839 | 53 | 1.9% |
| 7.913 | 41 | 1.5% |
| 10.025 | 39 | 1.4% |
| (Missing) | 1616 | 58.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 3.618 | 67 | 2.4% |
| 3.654 | 208 | 7.6% |
| 6.505 | 22 | 0.8% |
| 6.918 | 98 | 3.6% |
| 7.489 | 214 | 7.8% |
| 7.602 | 187 | 6.8% |
| 7.913 | 41 | 1.5% |
| 9.042 | 62 | 2.3% |
| 9.839 | 53 | 1.9% |
| 10.025 | 39 | 1.4% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10.271 | 144 | 5.2% |
| 10.025 | 39 | 1.4% |
| 9.839 | 53 | 1.9% |
| 9.042 | 62 | 2.3% |
| 7.913 | 41 | 1.5% |
| 7.602 | 187 | 6.8% |
| 7.489 | 214 | 7.8% |
| 6.918 | 98 | 3.6% |
| 6.505 | 22 | 0.8% |
| 3.654 | 208 | 7.6% |
climate_standardized_anomaly
Real number (ℝ)
High correlation Missing
| Distinct | 11 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 1616 |
| Missing (%) | 58.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -0.19795242 |
| Minimum | -1.853 |
| Maximum | 1.905 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 560 |
| Negative (%) | 20.4% |
| Memory size | 43.0 KiB |
Quantile statistics
| Minimum | -1.853 |
|---|---|
| 5-th percentile | -1.853 |
| Q1 | -1.189 |
| median | 0.007 |
| Q3 | 1.074 |
| 95-th percentile | 1.905 |
| Maximum | 1.905 |
| Range | 3.758 |
| Interquartile range (IQR) | 2.263 |
Descriptive statistics
| Standard deviation | 1.3126605 |
|---|---|
| Coefficient of variation (CV) | -6.6311919 |
| Kurtosis | -1.2451726 |
| Mean | -0.19795242 |
| Median Absolute Deviation (MAD) | 1.099 |
| Skewness | 0.36631188 |
| Sum | -224.676 |
| Variance | 1.7230776 |
| Monotonicity | Not monotonic |
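The negative coefficient of variation above is what results from dividing the standard deviation by a mean that is negative and close to zero, so the ratio says little about the spread of a standardized anomaly. A minimal check, assuming the CV is defined as standard deviation over mean (a definition that reproduces the reported figure):

```python
# Figures copied from the descriptive statistics above.
std_dev = 1.3126605
mean = -0.19795242

# Coefficient of variation, assumed here to be std / mean.
cv = std_dev / mean
print(round(cv, 4))
```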
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.007 | 214 | 7.8% |
| -1.853 | 208 | 7.6% |
| -1.092 | 187 | 6.8% |
| 1.781 | 144 | 5.2% |
| -1.189 | 98 | 3.6% |
| -0.752 | 67 | 2.4% |
| 1.905 | 62 | 2.3% |
| 1.604 | 53 | 1.9% |
| 0.19 | 41 | 1.5% |
| 1.074 | 39 | 1.4% |
| (Missing) | 1616 | 58.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -1.853 | 208 | 7.6% |
| -1.189 | 98 | 3.6% |
| -1.092 | 187 | 6.8% |
| -0.752 | 67 | 2.4% |
| 0.007 | 214 | 7.8% |
| 0.19 | 41 | 1.5% |
| 0.959 | 22 | 0.8% |
| 1.074 | 39 | 1.4% |
| 1.604 | 53 | 1.9% |
| 1.781 | 144 | 5.2% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.905 | 62 | 2.3% |
| 1.781 | 144 | 5.2% |
| 1.604 | 53 | 1.9% |
| 1.074 | 39 | 1.4% |
| 0.959 | 22 | 0.8% |
| 0.19 | 41 | 1.5% |
| 0.007 | 214 | 7.8% |
| -0.752 | 67 | 2.4% |
| -1.092 | 187 | 6.8% |
| -1.189 | 98 | 3.6% |
climate_heat_day_p90
Categorical
High correlation Imbalance Missing
Heat day indicator (>90th percentile)
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 1616 |
| Missing (%) | 58.7% |
| Memory size | 167.5 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 1073 | 39.0% |
| 1.0 | 62 | 2.3% |
| (Missing) | 1616 | 58.7% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 1073 | 94.5% |
| 1.0 | 62 | 5.5% |
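The two tables above report different percentages for the same count (2.3% versus 5.5% for the value 1.0). The figures are consistent with the first table dividing by all 2751 observations and the plot table dividing by the 1135 non-missing ones. The sketch below checks that reading; it is an inference from the numbers, not a documented behaviour of the profiling tool.

```python
total_rows = 2751   # number of observations in the dataset
missing = 1616      # missing cells for climate_heat_day_p90
heat_days = 62      # rows where the indicator is 1.0

non_missing = total_rows - missing

# Same count, two denominators.
share_of_all = 100 * heat_days / total_rows
share_of_non_missing = 100 * heat_days / non_missing

print(f"{share_of_all:.1f}% of all rows, {share_of_non_missing:.1f}% of non-missing rows")
```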
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 2208 | |
| . | 1135 | |
| 1 | 62 | 1.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2270 | |
| Other Punctuation | 1135 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 2208 | |
| 1 | 62 | 2.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 1135 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3405 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 2208 | |
| . | 1135 | |
| 1 | 62 | 1.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3405 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 2208 | |
| . | 1135 | |
| 1 | 62 | 1.8% |
climate_heat_day_p95
Categorical
High correlation Imbalance Missing
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 1616 |
| Missing (%) | 58.7% |
| Memory size | 167.5 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 1073 | 39.0% |
| 1.0 | 62 | 2.3% |
| (Missing) | 1616 | 58.7% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 1073 | 94.5% |
| 1.0 | 62 | 5.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 2208 | |
| . | 1135 | |
| 1 | 62 | 1.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2270 | |
| Other Punctuation | 1135 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 2208 | |
| 1 | 62 | 2.7% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 1135 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3405 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 2208 | |
| . | 1135 | |
| 1 | 62 | 1.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3405 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 2208 | |
| . | 1135 | |
| 1 | 62 | 1.8% |
climate_heat_stress_index
Real number (ℝ)
| Distinct | 11 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 1616 |
| Missing (%) | 58.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 18.312848 |
| Minimum | 13.428 |
| Maximum | 27.393 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.0 KiB |
Quantile statistics
| Minimum | 13.428 |
|---|---|
| 5-th percentile | 13.639 |
| Q1 | 14.306 |
| median | 17.923 |
| Q3 | 21.523 |
| 95-th percentile | 27.393 |
| Maximum | 27.393 |
| Range | 13.965 |
| Interquartile range (IQR) | 7.217 |
Descriptive statistics
| Standard deviation | 3.536553 |
|---|---|
| Coefficient of variation (CV) | 0.19311867 |
| Kurtosis | 0.20907555 |
| Mean | 18.312848 |
| Median Absolute Deviation (MAD) | 3.6 |
| Skewness | 0.58250383 |
| Sum | 20785.083 |
| Variance | 12.507207 |
| Monotonicity | Not monotonic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 21.523 | 214 | 7.8% |
| 19.275 | 208 | 7.6% |
| 17.347 | 187 | 6.8% |
| 14.306 | 144 | 5.2% |
| 13.639 | 98 | 3.6% |
| 17.923 | 67 | 2.4% |
| 27.393 | 62 | 2.3% |
| 13.428 | 53 | 1.9% |
| 15.721 | 41 | 1.5% |
| 19.958 | 39 | 1.4% |
| (Missing) | 1616 | 58.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 13.428 | 53 | 1.9% |
| 13.639 | 98 | 3.6% |
| 14.306 | 144 | 5.2% |
| 15.721 | 41 | 1.5% |
| 17.347 | 187 | 6.8% |
| 17.923 | 67 | 2.4% |
| 19.275 | 208 | 7.6% |
| 19.958 | 39 | 1.4% |
| 21.523 | 214 | 7.8% |
| 22.526 | 22 | 0.8% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 27.393 | 62 | 2.3% |
| 22.526 | 22 | 0.8% |
| 21.523 | 214 | 7.8% |
| 19.958 | 39 | 1.4% |
| 19.275 | 208 | 7.6% |
| 17.923 | 67 | 2.4% |
| 17.347 | 187 | 6.8% |
| 15.721 | 41 | 1.5% |
| 14.306 | 144 | 5.2% |
| 13.639 | 98 | 3.6% |
climate_p90_threshold
Categorical
Constant Missing
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 1616 |
| Missing (%) | 58.7% |
| Memory size | 170.8 KiB |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 28.409 |
|---|---|
| 2nd row | 28.409 |
| 3rd row | 28.409 |
| 4th row | 28.409 |
| 5th row | 28.409 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 28.409 | 1135 | 41.3% |
| (Missing) | 1616 | 58.7% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| 28.409 | 1135 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 1135 | |
| 8 | 1135 | |
| . | 1135 | |
| 4 | 1135 | |
| 0 | 1135 | |
| 9 | 1135 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5675 | |
| Other Punctuation | 1135 | 16.7% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 1135 | |
| 8 | 1135 | |
| 4 | 1135 | |
| 0 | 1135 | |
| 9 | 1135 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 1135 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6810 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 1135 | |
| 8 | 1135 | |
| . | 1135 | |
| 4 | 1135 | |
| 0 | 1135 | |
| 9 | 1135 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6810 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 1135 | |
| 8 | 1135 | |
| . | 1135 | |
| 4 | 1135 | |
| 0 | 1135 | |
| 9 | 1135 |
climate_p95_threshold
Categorical
Constant Missing
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 1616 |
| Missing (%) | 58.7% |
| Memory size | 170.8 KiB |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 29.704 |
|---|---|
| 2nd row | 29.704 |
| 3rd row | 29.704 |
| 4th row | 29.704 |
| 5th row | 29.704 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 29.704 | 1135 | 41.3% |
| (Missing) | 1616 | 58.7% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| 29.704 | 1135 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 1135 | |
| 9 | 1135 | |
| . | 1135 | |
| 7 | 1135 | |
| 0 | 1135 | |
| 4 | 1135 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5675 | |
| Other Punctuation | 1135 | 16.7% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 1135 | |
| 9 | 1135 | |
| 7 | 1135 | |
| 0 | 1135 | |
| 4 | 1135 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 1135 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6810 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 1135 | |
| 9 | 1135 | |
| . | 1135 | |
| 7 | 1135 | |
| 0 | 1135 | |
| 4 | 1135 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6810 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 1135 | |
| 9 | 1135 | |
| . | 1135 | |
| 7 | 1135 | |
| 0 | 1135 | |
| 4 | 1135 |
climate_p99_threshold
Categorical
Constant Missing
| Distinct | 1 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 1616 |
| Missing (%) | 58.7% |
| Memory size | 170.8 KiB |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 31.797 |
|---|---|
| 2nd row | 31.797 |
| 3rd row | 31.797 |
| 4th row | 31.797 |
| 5th row | 31.797 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 31.797 | 1135 | 41.3% |
| (Missing) | 1616 | 58.7% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| 31.797 | 1135 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 7 | 2270 | |
| 3 | 1135 | |
| 1 | 1135 | |
| . | 1135 | |
| 9 | 1135 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5675 | |
| Other Punctuation | 1135 | 16.7% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 7 | 2270 | |
| 3 | 1135 | |
| 1 | 1135 | |
| 9 | 1135 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 1135 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 6810 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 7 | 2270 | |
| 3 | 1135 | |
| 1 | 1135 | |
| . | 1135 | |
| 9 | 1135 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6810 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 7 | 2270 | |
| 3 | 1135 | |
| 1 | 1135 | |
| . | 1135 | |
| 9 | 1135 |
climate_season
Categorical
High correlation Missing
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 1616 |
| Missing (%) | 58.7% |
| Memory size | 170.8 KiB |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Autumn |
|---|---|
| 2nd row | Spring |
| 3rd row | Winter |
| 4th row | Spring |
| 5th row | Spring |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Spring | 609 | 22.1% |
| Winter | 295 | 10.7% |
| Summer | 129 | 4.7% |
| Autumn | 102 | 3.7% |
| (Missing) | 1616 | 58.7% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| Spring | 609 | 53.7% |
| Winter | 295 | 26.0% |
| Summer | 129 | 11.4% |
| Autumn | 102 | 9.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 1033 | |
| n | 1006 | |
| i | 904 | |
| S | 738 | |
| p | 609 | |
| g | 609 | |
| e | 424 | |
| t | 397 | 5.8% |
| m | 360 | 5.3% |
| u | 333 | 4.9% |
| Other values (2) | 397 | 5.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5675 | |
| Uppercase Letter | 1135 | 16.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 1033 | |
| n | 1006 | |
| i | 904 | |
| p | 609 | |
| g | 609 | |
| e | 424 | |
| t | 397 | 7.0% |
| m | 360 | 6.3% |
| u | 333 | 5.9% |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 738 | |
| W | 295 | 26.0% |
| A | 102 | 9.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6810 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 1033 | |
| n | 1006 | |
| i | 904 | |
| S | 738 | |
| p | 609 | |
| g | 609 | |
| e | 424 | |
| t | 397 | 5.8% |
| m | 360 | 5.3% |
| u | 333 | 4.9% |
| Other values (2) | 397 | 5.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6810 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| r | 1033 | |
| n | 1006 | |
| i | 904 | |
| S | 738 | |
| p | 609 | |
| g | 609 | |
| e | 424 | |
| t | 397 | 5.8% |
| m | 360 | 5.3% |
| u | 333 | 4.9% |
| Other values (2) | 397 | 5.8% |
sa_biomarker_standards
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 161.2 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.0 | 2751 | 100.0% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| 1.0 | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 2751 | |
| . | 2751 | |
| 0 | 2751 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5502 | |
| Other Punctuation | 2751 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 2751 | |
| 0 | 2751 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 2751 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 8253 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 2751 | |
| . | 2751 | |
| 0 | 2751 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8253 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 2751 | |
| . | 2751 | |
| 0 | 2751 |
cd4_correction_applied
Categorical
High correlation Imbalance
Quality flag: CD4 missing codes removed
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 161.2 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 2696 | 98.0% |
| 1.0 | 55 | 2.0% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 2696 | 98.0% |
| 1.0 | 55 | 2.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 5447 | |
| . | 2751 | |
| 1 | 55 | 0.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5502 | |
| Other Punctuation | 2751 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 5447 | |
| 1 | 55 | 1.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 2751 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 8253 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 5447 | |
| . | 2751 | |
| 1 | 55 | 0.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8253 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 5447 | |
| . | 2751 | |
| 1 | 55 | 0.7% |
final_comprehensive_fix_applied
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 161.2 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.0 | 2751 | 100.0% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| 1.0 | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 2751 | |
| . | 2751 | |
| 0 | 2751 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5502 | |
| Other Punctuation | 2751 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 2751 | |
| 0 | 2751 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 2751 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 8253 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 2751 | |
| . | 2751 | |
| 0 | 2751 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8253 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 2751 | |
| . | 2751 | |
| 0 | 2751 |
total_protein_extreme_flag
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 161.2 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 2751 | 100.0% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 5502 | |
| . | 2751 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5502 | |
| Other Punctuation | 2751 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 5502 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 2751 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 8253 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 5502 | |
| . | 2751 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8253 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 5502 | |
| . | 2751 |
dphru_053_final_corrections_applied
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 161.2 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 2751 | 100.0% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 5502 | |
| . | 2751 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5502 | |
| Other Punctuation | 2751 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 5502 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 2751 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 8253 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 5502 | |
| . | 2751 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8253 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 5502 | |
| . | 2751 |
ezin_002_final_corrections_applied
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 161.2 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 2751 | 100.0% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0 | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 5502 | |
| . | 2751 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5502 | |
| Other Punctuation | 2751 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 5502 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 2751 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 8253 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 5502 | |
| . | 2751 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8253 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 5502 | |
| . | 2751 |
quality_harmonization_version
Categorical
Constant
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 161.2 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2.0 |
|---|---|
| 2nd row | 2.0 |
| 3rd row | 2.0 |
| 4th row | 2.0 |
| 5th row | 2.0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.0 | 2751 | 100.0% |
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| 2.0 | 2751 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 2751 | |
| . | 2751 | |
| 0 | 2751 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 5502 | |
| Other Punctuation | 2751 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 2751 | |
| 0 | 2751 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 2751 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 8253 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 2751 | |
| . | 2751 | |
| 0 | 2751 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8253 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 2751 | |
| . | 2751 | |
| 0 | 2751 |
waist_circ_unit_correction_applied
Boolean
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 24.2 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| False | 2751 | 100.0% |
Interactions
Correlations
| | Age (at enrolment) | CD4 cell count (cells/µL) | HIV viral load (copies/mL) | Sex | cd4_correction_applied | climate_14d_mean_temp | climate_30d_mean_temp | climate_7d_max_temp | climate_7d_mean_temp | climate_daily_max_temp | climate_daily_mean_temp | climate_daily_min_temp | climate_heat_day_p90 | climate_heat_day_p95 | climate_heat_stress_index | climate_season | climate_standardized_anomaly | climate_temp_anomaly | month | season | year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age (at enrolment) | 1.000 | -0.130 | -0.088 | 0.200 | 0.054 | 0.027 | 0.028 | 0.023 | 0.033 | 0.012 | 0.021 | 0.028 | 0.052 | 0.052 | 0.025 | 0.041 | 0.005 | -0.020 | 0.019 | 0.050 | 0.044 |
| CD4 cell count (cells/µL) | -0.130 | 1.000 | 0.031 | 0.168 | 1.000 | 0.007 | 0.004 | -0.003 | -0.004 | 0.038 | 0.042 | 0.018 | 0.000 | 0.000 | 0.008 | 0.000 | 0.024 | 0.033 | -0.016 | 0.000 | 0.042 |
| HIV viral load (copies/mL) | -0.088 | 0.031 | 1.000 | 0.099 | 0.488 | 0.021 | 0.037 | 0.004 | 0.042 | 0.082 | 0.097 | 0.074 | 0.000 | 0.000 | 0.035 | 0.056 | 0.041 | -0.010 | -0.053 | 0.027 | 0.024 |
| Sex | 0.200 | 0.168 | 0.099 | 1.000 | 0.000 | 0.000 | 0.000 | 0.051 | 0.060 | 0.000 | 0.000 | 0.042 | 0.000 | 0.000 | 0.027 | 0.000 | 0.036 | 0.000 | 0.072 | 0.029 | 0.058 |
| cd4_correction_applied | 0.054 | 1.000 | 0.488 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.008 | 0.000 |
| climate_14d_mean_temp | 0.027 | 0.007 | 0.021 | 0.000 | 0.000 | 1.000 | 0.978 | 0.968 | 0.912 | 0.839 | 0.650 | 0.527 | 0.997 | 0.997 | 0.983 | 0.925 | 0.006 | -0.345 | 0.433 | 0.925 | 0.997 |
| climate_30d_mean_temp | 0.028 | 0.004 | 0.037 | 0.000 | 0.000 | 0.978 | 1.000 | 0.956 | 0.912 | 0.861 | 0.714 | 0.615 | 0.849 | 0.849 | 0.954 | 0.774 | 0.049 | -0.370 | 0.468 | 0.774 | 0.849 |
| climate_7d_max_temp | 0.023 | -0.003 | 0.004 | 0.051 | 0.000 | 0.968 | 0.956 | 1.000 | 0.871 | 0.825 | 0.614 | 0.486 | 0.417 | 0.417 | 0.941 | 0.793 | -0.022 | -0.336 | 0.496 | 0.793 | 0.417 |
| climate_7d_mean_temp | 0.033 | -0.004 | 0.042 | 0.060 | 0.000 | 0.912 | 0.912 | 0.871 | 1.000 | 0.736 | 0.707 | 0.736 | 0.998 | 0.998 | 0.916 | 0.664 | 0.278 | -0.206 | 0.351 | 0.664 | 0.998 |
| climate_daily_max_temp | 0.012 | 0.038 | 0.082 | 0.000 | 0.000 | 0.839 | 0.861 | 0.825 | 0.736 | 1.000 | 0.883 | 0.647 | 0.998 | 0.998 | 0.859 | 0.760 | 0.203 | -0.037 | 0.288 | 0.760 | 0.998 |
| climate_daily_mean_temp | 0.021 | 0.042 | 0.097 | 0.000 | 0.000 | 0.650 | 0.714 | 0.614 | 0.707 | 0.883 | 1.000 | 0.900 | 0.998 | 0.998 | 0.672 | 0.738 | 0.564 | 0.220 | 0.163 | 0.738 | 0.998 |
| climate_daily_min_temp | 0.028 | 0.018 | 0.074 | 0.042 | 0.000 | 0.527 | 0.615 | 0.486 | 0.736 | 0.647 | 0.900 | 1.000 | 0.609 | 0.609 | 0.537 | 0.941 | 0.744 | 0.266 | 0.076 | 0.941 | 0.609 |
| climate_heat_day_p90 | 0.052 | 0.000 | 0.000 | 0.000 | 0.000 | 0.997 | 0.849 | 0.417 | 0.998 | 0.998 | 0.998 | 0.609 | 1.000 | 0.991 | 0.997 | 0.670 | 0.436 | 0.998 | 0.996 | 0.670 | 0.991 |
| climate_heat_day_p95 | 0.052 | 0.000 | 0.000 | 0.000 | 0.000 | 0.997 | 0.849 | 0.417 | 0.998 | 0.998 | 0.998 | 0.609 | 0.991 | 1.000 | 0.997 | 0.670 | 0.436 | 0.998 | 0.996 | 0.670 | 0.991 |
| climate_heat_stress_index | 0.025 | 0.008 | 0.035 | 0.027 | 0.000 | 0.983 | 0.954 | 0.941 | 0.916 | 0.859 | 0.672 | 0.537 | 0.997 | 0.997 | 1.000 | 0.933 | 0.037 | -0.295 | 0.377 | 0.933 | 0.997 |
| climate_season | 0.041 | 0.000 | 0.056 | 0.000 | 0.000 | 0.925 | 0.774 | 0.793 | 0.664 | 0.760 | 0.738 | 0.941 | 0.670 | 0.670 | 0.933 | 1.000 | 0.817 | 0.785 | 0.914 | 1.000 | 0.670 |
| climate_standardized_anomaly | 0.005 | 0.024 | 0.041 | 0.036 | 0.000 | 0.006 | 0.049 | -0.022 | 0.278 | 0.203 | 0.564 | 0.744 | 0.436 | 0.436 | 0.037 | 0.817 | 1.000 | 0.782 | -0.497 | 0.817 | 0.436 |
| climate_temp_anomaly | -0.020 | 0.033 | -0.010 | 0.000 | 0.000 | -0.345 | -0.370 | -0.336 | -0.206 | -0.037 | 0.220 | 0.266 | 0.998 | 0.998 | -0.295 | 0.785 | 0.782 | 1.000 | -0.689 | 0.785 | 0.998 |
| month | 0.019 | -0.016 | -0.053 | 0.072 | 0.000 | 0.433 | 0.468 | 0.496 | 0.351 | 0.288 | 0.163 | 0.076 | 0.996 | 0.996 | 0.377 | 0.914 | -0.497 | -0.689 | 1.000 | 0.967 | 0.387 |
| season | 0.050 | 0.000 | 0.027 | 0.029 | 0.008 | 0.925 | 0.774 | 0.793 | 0.664 | 0.760 | 0.738 | 0.941 | 0.670 | 0.670 | 0.933 | 1.000 | 0.817 | 0.785 | 0.967 | 1.000 | 0.282 |
| year | 0.044 | 0.042 | 0.024 | 0.058 | 0.000 | 0.997 | 0.849 | 0.417 | 0.998 | 0.998 | 0.998 | 0.609 | 0.991 | 0.991 | 0.997 | 0.670 | 0.436 | 0.998 | 0.387 | 0.282 | 1.000 |
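One way to read the matrix above is to list the variable pairs whose coefficient exceeds some cut-off in absolute value. The sketch below does this for a handful of entries copied from the table. The 0.9 threshold and the helper name `strong_pairs` are illustrative assumptions, not settings taken from the report.

```python
# A few entries copied from the correlation table above (one direction only).
corr = {
    ("climate_14d_mean_temp", "climate_30d_mean_temp"): 0.978,
    ("climate_14d_mean_temp", "climate_7d_max_temp"): 0.968,
    ("climate_14d_mean_temp", "climate_daily_min_temp"): 0.527,
    ("climate_14d_mean_temp", "climate_standardized_anomaly"): 0.006,
    ("climate_temp_anomaly", "month"): -0.689,
    ("CD4 cell count (cells/µL)", "cd4_correction_applied"): 1.000,
}

THRESHOLD = 0.9  # hypothetical cut-off for "strongly associated"


def strong_pairs(matrix, threshold):
    """Return pairs with |coefficient| >= threshold, strongest first."""
    selected = [pair for pair, r in matrix.items() if abs(r) >= threshold]
    return sorted(selected, key=lambda pair: -abs(matrix[pair]))


for left, right in strong_pairs(corr, THRESHOLD):
    print(f"{left} ~ {right}: {corr[(left, right)]:.3f}")
```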
Missing values
Sample
| | study_source | primary_date | year | month | season | latitude | longitude | jhb_subregion | city | province | country | Age (at enrolment) | Sex | CD4 cell count (cells/µL) | HIV viral load (copies/mL) | date | Country | Clinical Study ID | Location of study follow-up | coordinate_source | coordinate_precision | geographic_source | HIV_status | johannesburg_metro_valid | study_site_location | climate_daily_mean_temp | climate_daily_max_temp | climate_daily_min_temp | climate_7d_mean_temp | climate_7d_max_temp | climate_14d_mean_temp | climate_30d_mean_temp | climate_temp_anomaly | climate_standardized_anomaly | climate_heat_day_p90 | climate_heat_day_p95 | climate_heat_stress_index | climate_p90_threshold | climate_p95_threshold | climate_p99_threshold | climate_season | sa_biomarker_standards | cd4_correction_applied | final_comprehensive_fix_applied | total_protein_extreme_flag | dphru_053_final_corrections_applied | ezin_002_final_corrections_applied | quality_harmonization_version | waist_circ_unit_correction_applied |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3377 | JHB_Aurum_009 | 2014-02-15 | 2014.0 | 2.0 | Summer | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 24.0 | Female | 369.0 | 0.0 | 2014-02-15 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3378 | JHB_Aurum_009 | 2014-04-09 | 2014.0 | 4.0 | Autumn | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 38.0 | Female | 701.0 | NaN | 2014-04-09 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3379 | JHB_Aurum_009 | 2014-08-12 | 2014.0 | 8.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 21.0 | Male | 654.0 | NaN | 2014-08-12 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3380 | JHB_Aurum_009 | 2014-04-29 | 2014.0 | 4.0 | Autumn | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 29.0 | Male | 350.0 | NaN | 2014-04-29 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3381 | JHB_Aurum_009 | 2013-04-29 | 2013.0 | 4.0 | Autumn | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 35.0 | Female | 324.0 | 0.0 | 2013-04-29 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | 17.799 | 25.800 | 10.493 | 16.471 | 26.761 | 16.057 | 15.775 | 10.025 | 1.074 | 0.0 | 0.0 | 19.958 | 28.409 | 29.704 | 31.797 | Autumn | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3382 | JHB_Aurum_009 | 2014-06-26 | 2014.0 | 6.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 22.0 | Male | 276.0 | NaN | 2014-06-26 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3383 | JHB_Aurum_009 | 2013-11-19 | 2013.0 | 11.0 | Spring | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 38.0 | Female | NaN | NaN | 2013-11-19 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | 19.293 | 26.343 | 11.253 | 19.038 | 29.704 | 19.069 | 18.854 | 7.489 | 0.007 | 0.0 | 0.0 | 21.523 | 28.409 | 29.704 | 31.797 | Spring | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3384 | JHB_Aurum_009 | 2014-09-08 | 2014.0 | 9.0 | Spring | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | NaN | Male | NaN | NaN | 2014-09-08 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3385 | JHB_Aurum_009 | 2013-08-24 | 2013.0 | 8.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 22.0 | Female | 525.0 | NaN | 2013-08-24 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | 9.356 | 17.553 | 2.343 | 9.215 | 17.721 | 10.426 | 10.635 | 6.918 | -1.189 | 0.0 | 0.0 | 13.639 | 28.409 | 29.704 | 31.797 | Winter | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 3386 | JHB_Aurum_009 | 2014-03-24 | 2014.0 | 3.0 | Autumn | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 42.0 | Male | 287.0 | NaN | 2014-03-24 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
Last rows
| | study_source | primary_date | year | month | season | latitude | longitude | jhb_subregion | city | province | country | Age (at enrolment) | Sex | CD4 cell count (cells/µL) | HIV viral load (copies/mL) | date | Country | Clinical Study ID | Location of study follow-up | coordinate_source | coordinate_precision | geographic_source | HIV_status | johannesburg_metro_valid | study_site_location | climate_daily_mean_temp | climate_daily_max_temp | climate_daily_min_temp | climate_7d_mean_temp | climate_7d_max_temp | climate_14d_mean_temp | climate_30d_mean_temp | climate_temp_anomaly | climate_standardized_anomaly | climate_heat_day_p90 | climate_heat_day_p95 | climate_heat_stress_index | climate_p90_threshold | climate_p95_threshold | climate_p99_threshold | climate_season | sa_biomarker_standards | cd4_correction_applied | final_comprehensive_fix_applied | total_protein_extreme_flag | dphru_053_final_corrections_applied | ezin_002_final_corrections_applied | quality_harmonization_version | waist_circ_unit_correction_applied |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6118 | JHB_Aurum_009 | 2013-07-17 | 2013.0 | 7.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 23.0 | Male | 174.0 | NaN | 2013-07-17 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | 13.868 | 21.347 | 7.436 | 12.781 | 21.520 | 12.258 | 11.076 | 10.271 | 1.781 | 0.0 | 0.0 | 14.306 | 28.409 | 29.704 | 31.797 | Winter | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6119 | JHB_Aurum_009 | 2013-06-06 | 2013.0 | 6.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 36.0 | Male | 110.0 | NaN | 2013-06-06 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | 13.656 | 21.474 | 6.034 | 10.793 | 21.977 | 11.532 | 11.635 | 9.839 | 1.604 | 0.0 | 0.0 | 13.428 | 28.409 | 29.704 | 31.797 | Winter | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6120 | JHB_Aurum_009 | 2014-06-17 | 2014.0 | 6.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 29.0 | Male | 393.0 | 0.0 | 2014-06-17 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6121 | JHB_Aurum_009 | 2014-02-03 | 2014.0 | 2.0 | Summer | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 34.0 | Female | 202.0 | NaN | 2014-02-03 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6122 | JHB_Aurum_009 | 2014-04-29 | 2014.0 | 4.0 | Autumn | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 34.0 | Female | 31.0 | NaN | 2014-04-29 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6123 | JHB_Aurum_009 | 2014-04-23 | 2014.0 | 4.0 | Autumn | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 31.0 | Male | 365.0 | NaN | 2014-04-23 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6124 | JHB_Aurum_009 | 2013-08-27 | 2013.0 | 8.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 31.0 | Female | 586.0 | NaN | 2013-08-27 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | 9.356 | 17.553 | 2.343 | 9.215 | 17.721 | 10.426 | 10.635 | 6.918 | -1.189 | 0.0 | 0.0 | 13.639 | 28.409 | 29.704 | 31.797 | Winter | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6125 | JHB_Aurum_009 | 2014-08-14 | 2014.0 | 8.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 65.0 | Male | 409.0 | NaN | 2014-08-14 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6126 | JHB_Aurum_009 | 2014-08-04 | 2014.0 | 8.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 28.0 | Male | 455.0 | NaN | 2014-08-04 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
| 6127 | JHB_Aurum_009 | 2013-11-16 | 2013.0 | 11.0 | Spring | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 23.0 | Male | 300.0 | NaN | 2013-11-16 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | 19.293 | 26.343 | 11.253 | 19.038 | 29.704 | 19.069 | 18.854 | 7.489 | 0.007 | 0.0 | 0.0 | 21.523 | 28.409 | 29.704 | 31.797 | Spring | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False |
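In the sample rows shown above, the climate_* columns are populated only for 2013 dates and are NaN for every 2014 date. Whether that holds for the full dataset cannot be read off a 20-row sample. A minimal pandas sketch for checking it is given here, assuming the data is loaded as a DataFrame with the `year` and `climate_daily_mean_temp` columns listed above; the helper name `missing_share_by_year` and the toy frame are illustrative only.

```python
import numpy as np
import pandas as pd


def missing_share_by_year(df: pd.DataFrame, column: str, year_col: str = "year") -> pd.Series:
    """Return the fraction of missing values in `column` for each value of `year_col`."""
    # isna() gives a boolean Series; its mean within each year is the missing share.
    return df[column].isna().groupby(df[year_col]).mean()


# Toy frame that mimics the pattern visible in the sample rows (hypothetical data layout).
toy = pd.DataFrame(
    {
        "year": [2013.0, 2013.0, 2014.0, 2014.0, 2014.0],
        "climate_daily_mean_temp": [17.799, 9.356, np.nan, np.nan, np.nan],
    }
)

share = missing_share_by_year(toy, "climate_daily_mean_temp")
print(share)
```

Running the same helper on the real DataFrame for each climate_* column would show whether the gap is confined to 2014 or merely over-represented in the sampled rows.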
Duplicate rows
Most frequently occurring
| | study_source | primary_date | year | month | season | latitude | longitude | jhb_subregion | city | province | country | Age (at enrolment) | Sex | CD4 cell count (cells/µL) | HIV viral load (copies/mL) | date | Country | Clinical Study ID | Location of study follow-up | coordinate_source | coordinate_precision | geographic_source | HIV_status | johannesburg_metro_valid | study_site_location | climate_daily_mean_temp | climate_daily_max_temp | climate_daily_min_temp | climate_7d_mean_temp | climate_7d_max_temp | climate_14d_mean_temp | climate_30d_mean_temp | climate_temp_anomaly | climate_standardized_anomaly | climate_heat_day_p90 | climate_heat_day_p95 | climate_heat_stress_index | climate_p90_threshold | climate_p95_threshold | climate_p99_threshold | climate_season | sa_biomarker_standards | cd4_correction_applied | final_comprehensive_fix_applied | total_protein_extreme_flag | dphru_053_final_corrections_applied | ezin_002_final_corrections_applied | quality_harmonization_version | waist_circ_unit_correction_applied | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | JHB_Aurum_009 | 2013-07-15 | 2013.0 | 7.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 23.0 | Male | NaN | NaN | 2013-07-15 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | 13.868 | 21.347 | 7.436 | 12.781 | 21.52 | 12.258 | 11.076 | 10.271 | 1.781 | 0.0 | 0.0 | 14.306 | 28.409 | 29.704 | 31.797 | Winter | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False | 2 |
| 1 | JHB_Aurum_009 | 2014-03-29 | 2014.0 | 3.0 | Autumn | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 32.0 | Female | NaN | NaN | 2014-03-29 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False | 2 |
| 2 | JHB_Aurum_009 | 2014-04-02 | 2014.0 | 4.0 | Autumn | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 49.0 | Male | NaN | NaN | 2014-04-02 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False | 2 |
| 3 | JHB_Aurum_009 | 2014-08-12 | 2014.0 | 8.0 | Winter | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 39.0 | Male | NaN | NaN | 2014-08-12 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False | 2 |
| 4 | JHB_Aurum_009 | 2014-10-28 | 2014.0 | 10.0 | Spring | -25.7479 | 28.2293 | Eastern_JHB | Johannesburg | Gauteng | South Africa | 37.0 | Female | NaN | NaN | 2014-10-28 | South Africa | Tholimpilo_HIV_Linkage_Study | Aurum Institute - Multi-site Gauteng and Limpopo | JHB_Aurum_009 | high | harmonized_datasets | Positive | 1.0 | Tembisa/East Rand (Aurum Institute) | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 2.0 | False | 2 |
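The table above lists five distinct rows that each occur twice. A comparable table can be rebuilt with plain pandas, as sketched below. This is not necessarily how the profiling tool computes it; the function name `most_frequent_duplicates` and the toy frame are assumptions for illustration. Grouping uses `dropna=False` (available in pandas 1.1 and later) so that rows containing NaN, as all five duplicates here do, are not silently dropped.

```python
import numpy as np
import pandas as pd


def most_frequent_duplicates(df: pd.DataFrame, n: int = 10) -> pd.DataFrame:
    """Count how often each fully duplicated row occurs, most frequent first."""
    # keep=False marks every member of a duplicate group, not only the repeats.
    dup = df[df.duplicated(keep=False)]
    return (
        dup.groupby(list(dup.columns), dropna=False)  # keep NaN-containing keys
        .size()
        .reset_index(name="# duplicates")
        .sort_values("# duplicates", ascending=False)
        .head(n)
        .reset_index(drop=True)
    )


# Toy frame with two duplicated rows and one unique row (hypothetical subset of columns).
toy = pd.DataFrame(
    {
        "primary_date": ["2013-07-15", "2013-07-15", "2014-03-29", "2014-03-29", "2014-04-09"],
        "Sex": ["Male", "Male", "Female", "Female", "Female"],
        "CD4 cell count (cells/µL)": [np.nan, np.nan, np.nan, np.nan, 701.0],
    }
)

result = most_frequent_duplicates(toy)
print(result)
```

Because every duplicated row in the table has a missing CD4 count and viral load, these may be distinct participants who happen to share a date, age, and sex rather than true repeated records; a participant identifier, if available, would be needed to tell the two apart.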